Maintenance of maximal frequent itemsets in large databases

Authors:
Wang Lian;David W. Cheung;S. M. Yiu
Affiliations:
Macao University of Science and Technology, Avenida Wai Long, Macao;University of Hong Kong, Pokfulam Road, Hong Kong;University of Hong Kong, Pokfulam Road, Hong Kong
Venue:
Proceedings of the 2007 ACM symposium on Applied computing
Year:
2007

Abstract

There have been many studies on the efficient discovery of maximal frequent itemsets (MFIs) in large databases. However, maintaining the discovered itemsets is nontrivial when new data keeps being inserted into the database, because the insertions may invalidate some existing maximal frequent itemsets and create new ones. In this paper, we characterize the relationships between the old and new maximal frequent itemsets and propose an algorithm, IMFI, that exploits these relationships to reuse previously discovered knowledge. The algorithm follows a top-down mechanism, rather than the traditional bottom-up approach, in order to produce fewer candidates. Moreover, we integrate the SG-tree into IMFI to improve counting efficiency; counting with the SG-tree is faster than methods based on a vertical bitmap representation of the database. We evaluate IMFI on both synthetic and real databases. Preliminary results show that IMFI is consistently much faster than an available incremental MFI mining algorithm, especially when it is equipped with the SG-tree.
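
The effect of insertions described above can be illustrated with a small sketch. The code below is not IMFI and does not use an SG-tree. It is a naive brute-force enumeration, assuming a relative minimum support threshold and a toy database invented for this illustration. The function name `maximal_frequent_itemsets` and the variables `old_db` and `increment` are hypothetical.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force sketch: return the frequent itemsets that have no
    frequent proper superset. Exponential in the number of items, so it
    is only suitable for tiny illustrative databases."""
    threshold = min_support * len(transactions)
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support count: number of transactions containing the candidate.
            count = sum(1 for t in transactions if cset <= t)
            if count >= threshold:
                frequent.append(cset)
    # Keep only itemsets that are not strictly contained in another frequent one.
    return {f for f in frequent if not any(f < g for g in frequent)}


# Toy data (assumed for illustration, not taken from the paper).
old_db = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c"}]
increment = [{"a", "d"}, {"a", "d"}, {"d"}, {"b", "d"}]

before = maximal_frequent_itemsets(old_db, 0.5)
after = maximal_frequent_itemsets(old_db + increment, 0.5)

print(sorted(sorted(s) for s in before))  # [['a', 'b', 'c']]
print(sorted(sorted(s) for s in after))   # [['a'], ['b'], ['d']]
```

In this toy case the insertions invalidate the old maximal frequent itemset {a, b, c}, promote two of its subsets ({a} and {b}) to maximal, and introduce a new one ({d}). Recomputing everything from scratch, as this sketch does, is exactly the cost an incremental algorithm tries to avoid by reusing the old result.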