Efficiently mining frequent itemsets with compact FP-tree

Authors:
Qin Liang-Xi;Luo Ping;Shi Zhong-Zhi
Affiliations:
Key Lab of Intell. Info. Proc., Inst. of Comp. Technol., Ch. Acad. of Sci., Beijing and Grad. Sch. of Ch. Acad. of Sci., Beijing and Coll. of Comp. and Info. Eng., Guangxi Univ., Nanning;Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing and Graduate School of Chinese Academy of Sciences, Beijing;Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing
Venue:
Intelligent information processing II
Year:
2004

Citing 8
Cited 0

Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
An effective hash-based algorithm for mining association rules

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Dynamic itemset counting and implication rules for market basket data

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Mining frequent patterns without candidate generation

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
A tree projection algorithm for generation of frequent item sets

Journal of Parallel and Distributed Computing - Special issue on high-performance data mining
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
An Efficient Algorithm for Mining Association Rules in Large Databases

VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Sampling Large Databases for Association Rules

VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases

Quantified Score

Hi-index	0.00

Visualization

Abstract

FP-growth algorithm is an efficient algorithm for mining frequent patterns. It scans database only twice and does not need to generate and test the candidate sets that is quite time consuming. The efficiency of the FP-growth algorithm outperforms previously developed algorithms. But, it must recursively generate huge number of conditional FP-trees that requires much more memory and costs more time.In this paper, we present an algorithm, CFPmine, that is inspired by several previous works. CFPmine algorithm combines several advantages of existing techniques. One is using constrained subtrees of a compact FP-tree to mine frequent pattern, so that it is doesn't need to construct conditional FP-trees in the mining process. Second is using an array-based technique to reduce the traverse time to the CFP-tree. And an unified memeory management is also implemented in the algorithm. The experimental evaluation shows that CFPmine algorithm is a high performance algorithm. It outperforms Apriori, Eclat and FP-growth and requires less memory than FP-growth.