Finding all frequent patterns starting from the closure

Authors:
Mohammad El-Hajj;Osmar R. Zaïane
Affiliations:
Department of Computing Science, University of Alberta, Edmonton, AB, Canada;Department of Computing Science, University of Alberta, Edmonton, AB, Canada
Venue:
ADMA'05 Proceedings of the First International Conference on Advanced Data Mining and Applications
Year:
2005

Abstract

Efficient discovery of frequent patterns in large databases is an active research area in data mining, with broad applications in industry and deep implications for many areas of data mining. Although many efficient frequent-pattern mining techniques have been developed over the last decade, most of them assume relatively small databases, leaving extremely large but realistic datasets out of reach. A practical and appealing direction is to mine closed itemsets instead. Closed itemsets form a subset of all frequent patterns, yet they represent the full set well because they eliminate redundant patterns. In this paper we introduce an algorithm that discovers closed frequent patterns efficiently in extremely large datasets. Our implementation shows that the approach outperforms comparable state-of-the-art algorithms on extremely large datasets by at least one order of magnitude in both execution time and memory usage.
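
To make the notion of a closed itemset concrete, the following is a minimal brute-force sketch on a toy dataset. It is not the algorithm introduced in the paper, and it would not scale to the large datasets the abstract targets. It assumes the standard definition (a frequent itemset is closed if no proper superset has the same support), and the function names and transactions are hypothetical, chosen only for illustration. It also shows why closed itemsets are good representatives: the support of any frequent itemset can be recovered as the maximum support among its closed supersets.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute force: map each frequent itemset (frozenset) to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                result[candidate] = support
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with equal support."""
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(
            itemset < other and support == other_support
            for other, other_support in frequent.items()
        )
    }


def support_from_closed(itemset, closed):
    """Recover a frequent itemset's support from the closed itemsets alone."""
    return max(s for c, s in closed.items() if itemset <= c)


# Toy transactions (illustrative data, not taken from the paper).
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("bd"),
]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)

for itemset, support in sorted(closed.items(), key=lambda kv: -kv[1]):
    print("".join(sorted(itemset)), support)
```

In this toy example 7 itemsets are frequent but only 3 of them are closed ({b}, {a, b}, and {a, b, c}); the remaining 4 are redundant in the sense that their supports follow from the closed ones.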