Maintenance of maximal frequent itemsets in large databases

  • Authors:
  • Wang Lian; David W. Cheung; S. M. Yiu

  • Affiliations:
  • Macao University of Science and Technology, Avenida Wai Long, Macao; University of Hong Kong, Pokfulam Road, Hong Kong; University of Hong Kong, Pokfulam Road, Hong Kong

  • Venue:
  • Proceedings of the 2007 ACM Symposium on Applied Computing
  • Year:
  • 2007

Abstract

There have been many studies on the efficient discovery of maximal frequent itemsets (MFIs) in large databases. However, maintaining the discovered itemsets is nontrivial when data is continually inserted into the database, because the insertions may invalidate some existing maximal frequent itemsets and create new ones. In this paper, we characterize the relationships between the old and new maximal frequent itemsets and propose an algorithm, IMFI, that exploits these relationships to reuse previously discovered knowledge. The algorithm follows a top-down mechanism rather than the traditional bottom-up approach, so that it produces fewer candidates. Moreover, we integrate the SG-tree into IMFI to improve counting efficiency; this is faster than methods based on a vertical bitmap representation of the database. IMFI has been evaluated on both synthetic and real databases. Preliminary results show that IMFI is always much faster than an available incremental MFI mining algorithm, especially when it is equipped with the SG-tree.
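
To make the maintenance problem concrete, the following minimal Python sketch shows how inserting transactions can invalidate an existing maximal frequent itemset and create new ones. It is not the IMFI algorithm and does not use an SG-tree: it simply recomputes the maximal frequent itemsets by brute force, which is feasible only for toy data. The function names, the sample transactions, and the 0.5 support threshold are all assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force enumeration (illustrative only, exponential in the number of items)."""
    items = sorted(set().union(*transactions))
    frequent = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if support(frozenset(c), transactions) >= min_support
    ]
    # An itemset is maximal if no proper superset of it is also frequent.
    return {s for s in frequent if not any(s < other for other in frequent)}


# Hypothetical toy data: an existing database and a batch of inserted transactions.
old_db = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("cd")]
increment = [frozenset("cd"), frozenset("cd"), frozenset("cd"), frozenset("ac")]

old_mfi = maximal_frequent_itemsets(old_db, 0.5)
new_mfi = maximal_frequent_itemsets(old_db + increment, 0.5)

print("before insertion:", sorted("".join(sorted(s)) for s in old_mfi))
print("after insertion: ", sorted("".join(sorted(s)) for s in new_mfi))
```

In this toy run the only maximal frequent itemset before the insertion is {a, b, c}. After the insertion it is no longer frequent, its subset {a} becomes maximal, and {c, d} appears as a new maximal frequent itemset. Recomputing from scratch, as this sketch does, is the cost that an incremental approach aims to avoid by reusing the previously discovered itemsets.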