An efficient frequent pattern mining algorithm

Authors:
Jun Tan;Yingyong Bu;Bo Yang
Affiliations:
College of Computer Science, Central South University of Forestry and Technology, Changsha, Hunan Province, China;College of Mechanical and Electrical Engineering, Central South University, Changsha, Hunan Province, China;College of Mechanical and Electrical Engineering, Central South University, Changsha, Hunan Province, China
Venue:
FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 2
Year:
2009

Abstract

Efficient algorithms for mining frequent itemsets are crucial for mining association rules and for other data mining tasks. The FP-growth algorithm uses a prefix-tree structure, known as an FP-tree, to store compressed frequency information. Numerous experimental results have demonstrated that the algorithm performs extremely well. However, FP-growth needs two traversals of an FP-tree to construct each new conditional FP-tree. In this paper we present a novel FP-array technique that greatly reduces the need to traverse FP-trees, thus obtaining significantly improved performance for FP-tree based algorithms. The technique works especially well for sparse datasets. We then present a new algorithm that uses the FP-tree data structure in combination with the FP-array technique and obtains the counts of frequent items directly from the FP-array, omitting the first traversal and saving time. Experimental results show that the new algorithm outperforms other algorithms not only in speed but also in memory consumption and scalability.
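
The idea of reading conditional counts from an array rather than from a tree traversal can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the FP-array is a table of pairwise co-occurrence counts over the frequent items, and for brevity it fills that table straight from the transactions instead of during FP-tree construction. The names `build_fp_array` and `conditional_counts` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_fp_array(transactions, min_support):
    """Return (frequency-ordered frequent items, pair-count table).

    Assumption: the FP-array is modeled as a Counter keyed by (a, b),
    where a precedes b in the frequency order.
    """
    item_counts = Counter(item for t in transactions for item in set(t))
    # Frequent items in descending support order (ties broken by name).
    order = sorted(
        (i for i, c in item_counts.items() if c >= min_support),
        key=lambda i: (-item_counts[i], i),
    )
    rank = {item: r for r, item in enumerate(order)}
    fp_array = Counter()
    for t in transactions:
        # Drop infrequent items and sort the rest by frequency order.
        kept = sorted((i for i in set(t) if i in rank), key=rank.__getitem__)
        for a, b in combinations(kept, 2):
            fp_array[(a, b)] += 1
    return order, fp_array


def conditional_counts(item, order, fp_array, min_support):
    """Frequent items of `item`'s conditional pattern base, read from the
    array alone, with no counting pass over a tree."""
    idx = order.index(item)
    counts = {a: fp_array[(a, item)] for a in order[:idx]}
    return {a: c for a, c in counts.items() if c >= min_support}


transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
    ["a", "d"],
]
order, fp_array = build_fp_array(transactions, min_support=2)
print(order)
print(conditional_counts("c", order, fp_array, min_support=2))
```

In this sketch the lookup in `conditional_counts` stands in for the counting traversal that the abstract says can be omitted; a second traversal would still be needed to build the conditional FP-tree itself.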