Multi-core processors have proliferated across many domains in recent years. In this paper, we study the performance of frequent pattern mining on a modern multi-core machine. A detailed study shows that, even with the best implementation, current FP-tree based algorithms still under-utilize a multi-core system because of poor data locality and insufficient expression of parallelism. To address this problem, we propose two techniques: a cache-conscious FP-array (frequent pattern array) and a lock-free dataset tiling parallelization mechanism. The FP-array improves data locality and benefits from both hardware and software prefetching, yielding an overall 4.0-fold speedup over the state-of-the-art implementation. Furthermore, to unlock the power of multi-core processors, we propose a lock-free parallelization approach that restructures the FP-tree building algorithm. It not only eliminates the locks needed when fine-grained threads build a single FP-tree, but also improves temporal data locality. In summary, with the proposed cache-conscious FP-array and lock-free parallelization enhancements, the overall FP-tree algorithm achieves a 24-fold speedup on an 8-core machine. Finally, we believe the presented techniques can be applied to other data mining tasks as multi-core processors become prevalent.
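
The abstract does not spell out how the dataset is tiled, so the following is only a minimal sketch of one way a lock-free FP-tree build could be organized, not the authors' implementation. It assumes that transactions are grouped into tiles by their first item in global frequency order, so each worker builds a subtree that no other worker touches. The function names (`frequent_order`, `prepare`, `build_subtree`, `build_tiled`) and the nested-dict node layout are hypothetical, chosen for illustration.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor


def frequent_order(transactions, min_support):
    """Count item support and return {item: rank}, most frequent first."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = [i for i, c in counts.items() if c >= min_support]
    frequent.sort(key=lambda i: (-counts[i], i))  # ties broken by item name
    return {item: rank for rank, item in enumerate(frequent)}


def prepare(transactions, rank):
    """Drop infrequent items and sort each transaction by global rank."""
    paths = []
    for t in transactions:
        items = sorted({i for i in t if i in rank}, key=rank.__getitem__)
        if items:
            paths.append(items)
    return paths


def build_subtree(paths):
    """Insert paths into a prefix tree of the form item -> [count, children]."""
    root = {}
    for path in paths:
        level = root
        for item in path:
            node = level.setdefault(item, [0, {}])
            node[0] += 1
            level = node[1]
    return root


def build_tiled(transactions, min_support, workers=4):
    """Assumed tiling scheme: one tile per first item, one task per tile."""
    rank = frequent_order(transactions, min_support)
    tiles = defaultdict(list)
    for path in prepare(transactions, rank):
        tiles[path[0]].append(path)
    tree = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each subtree has exactly one top-level key (the tile's first item),
        # and tiles have distinct keys, so merging never conflicts.
        for subtree in pool.map(build_subtree, tiles.values()):
            tree.update(subtree)
    return tree


transactions = [["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
tree = build_tiled(transactions, min_support=2)
```

Because every tile writes only to its own subtree, no locks are needed during the build. In CPython the thread pool shows the structure of the scheme rather than a real speedup; a native implementation would be needed to observe gains of the kind reported above.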