Tree partition based parallel frequent pattern mining on shared memory systems

Authors:
Dehao Chen;Chunrong Lai;Wei Hu;WenGuang Chen;Yimin Zhang;Weimin Zheng
Affiliations:
Tsinghua University, Dept. of Computer Science, Beijing, China;Intel Corporation, Intel China Research Center, Beijing, China;Intel Corporation, Intel China Research Center, Beijing, China;Tsinghua University, Dept. of Computer Science, Beijing, China;Intel Corporation, Intel China Research Center, Beijing, China;Tsinghua University, Dept. of Computer Science, Beijing, China
Venue:
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Year:
2006

Citing 8
Cited 3

Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Fast discovery of association rules

Advances in knowledge discovery and data mining
Mining frequent patterns without candidate generation

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Hash based parallel algorithms for mining association rules

DIS '96 Proceedings of the fourth international conference on on Parallel and distributed information systems
Fast Parallel Association Rule Mining without Candidacy Generation

ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Mining of Association Rules in Very Large Databases: A Structured Parallel Approach

Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Shared Memory Parallelization of Data Mining Algorithms: Techniques, Programming Interface, and Performance

IEEE Transactions on Knowledge and Data Engineering
An efficient parallel and distributed algorithm for counting frequent sets

VECPAR'02 Proceedings of the 5th international conference on High performance computing for computational science

Optimization of frequent itemset mining on multiple-core processor

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Parallel frequent itemset mining using systolic arrays

Knowledge-Based Systems
Novel parallel method for mining frequent patterns on multi-core shared memory systems

DISCS-2013 Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this paper, we present a tree-partition algorithm for parallel mining of frequent patterns. Our work is based on FP-Growth algorithm, which is constituted of tree-building stage and mining stage. The main idea is to build only one FP-Tree in the memory, partition it into several independent parts and distribute them to different threads. A heuristic algorithm is devised to balance the workload. Our algorithm can not only alleviate the impact of locks during the tree-building stage, but also avoid the overhead that do great harm to the mining stage. We present the experiments on different kinds of datasets and compare the results with other parallel approaches. The results suggest that our approach has great advantage in efficiency, especially on certain kinds of datasets. As the number of processors increases, our parallel algorithm shows good scalability.