Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Fast discovery of association rules
Advances in knowledge discovery and data mining
Mining frequent patterns without candidate generation
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Hash based parallel algorithms for mining association rules
DIS '96 Proceedings of the fourth international conference on on Parallel and distributed information systems
Fast Parallel Association Rule Mining without Candidacy Generation
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Mining of Association Rules in Very Large Databases: A Structured Parallel Approach
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
IEEE Transactions on Knowledge and Data Engineering
An efficient parallel and distributed algorithm for counting frequent sets
VECPAR'02 Proceedings of the 5th international conference on High performance computing for computational science
Optimization of frequent itemset mining on multiple-core processor
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Parallel frequent itemset mining using systolic arrays
Knowledge-Based Systems
Novel parallel method for mining frequent patterns on multi-core shared memory systems
DISCS-2013 Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems
Hi-index | 0.00 |
In this paper, we present a tree-partition algorithm for parallel mining of frequent patterns. Our work is based on FP-Growth algorithm, which is constituted of tree-building stage and mining stage. The main idea is to build only one FP-Tree in the memory, partition it into several independent parts and distribute them to different threads. A heuristic algorithm is devised to balance the workload. Our algorithm can not only alleviate the impact of locks during the tree-building stage, but also avoid the overhead that do great harm to the mining stage. We present the experiments on different kinds of datasets and compare the results with other parallel approaches. The results suggest that our approach has great advantage in efficiency, especially on certain kinds of datasets. As the number of processors increases, our parallel algorithm shows good scalability.