Mining frequent patterns from large databases plays an essential role in many data mining tasks and has broad applications. Most previously proposed methods adopt Apriori-like candidate-generation-and-test approaches. However, those methods may encounter serious challenges when mining datasets with prolific patterns and/or long patterns. In this work, we develop a class of novel and efficient pattern-growth methods for mining various frequent patterns from large databases. Pattern-growth methods adopt a divide-and-conquer approach to decompose both the mining tasks and the databases. They then use a pattern-fragment growth method to avoid costly candidate-generation-and-test processing completely. Moreover, effective data structures are proposed to compress crucial information about frequent patterns and avoid expensive, repeated database scans. A comprehensive performance study shows that the pattern-growth methods FP-growth and H-mine are efficient and scalable. They are faster than some recently reported frequent pattern mining methods. Interestingly, pattern-growth methods are not only efficient but also effective. With pattern-growth methods, many interesting patterns can also be mined efficiently, such as sequential patterns and patterns with some tough non-anti-monotonic constraints. These techniques have strong implications for many other data mining tasks.
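The divide-and-conquer idea can be illustrated with a minimal sketch. It is not the FP-growth or H-mine implementation: it uses plain lists as projected databases instead of the compressed data structures mentioned above. The function name `pattern_growth`, the toy transactions, and the support threshold are all assumptions made for illustration.

```python
from collections import Counter


def pattern_growth(transactions, min_support, prefix=()):
    """Yield (itemset, support) pairs by recursively growing `prefix`.

    Simplified sketch: each recursive call works on a projected database
    (plain lists), not on an FP-tree or H-struct.
    """
    # Count the support of each item in the current (projected) database.
    counts = Counter()
    for t in transactions:
        counts.update(set(t))

    # Process frequent items in a fixed order so each itemset is found once.
    frequent = sorted(i for i, c in counts.items() if c >= min_support)

    for item in frequent:
        pattern = prefix + (item,)
        yield pattern, counts[item]

        # Project the database on `item`: keep only transactions that
        # contain it, restricted to frequent items ordered after it.
        projected = [
            [x for x in t if x > item and counts[x] >= min_support]
            for t in transactions
            if item in t
        ]
        projected = [t for t in projected if t]

        # Grow the pattern within the smaller projected database; no
        # candidate itemsets are generated and tested against the full data.
        yield from pattern_growth(projected, min_support, pattern)


# Hypothetical toy data for illustration only.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
patterns = dict(pattern_growth(transactions, min_support=3))
```

With this toy data, `patterns` maps each frequent itemset (as a tuple) to its support count. Each recursive call scans only the transactions that contain the current prefix, which is the decomposition of both the task and the database that the abstract describes.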