Efficient mining of frequent itemsets in distorted databases

  • Authors:
  • Jinlong Wang; Congfu Xu

  • Affiliations:
  • Institute of Artificial Intelligence, Zhejiang University, Hangzhou, China; Institute of Artificial Intelligence, Zhejiang University, Hangzhou, China

  • Venue:
  • AI'06 Proceedings of the 19th Australian joint conference on Artificial Intelligence: advances in Artificial Intelligence
  • Year:
  • 2006

Abstract

Recently, the data perturbation approach has been applied to data mining: original data values are modified so that reconstructing the values of any individual transaction is difficult. However, mining distorted databases incurs enormous overhead compared with mining normal data sets. This paper presents GrC-FIM, an algorithm that introduces granular computing (GrC) to address the efficiency problem of frequent itemset mining in distorted databases. Using the key granule concept and granule inference, the support counts of candidate non-key frequent itemsets can be inferred from the counts of their frequent sub-itemsets obtained earlier in the mining process. This eliminates the tedious support reconstruction for these itemsets. Accuracy improves on dense data sets and is unchanged on sparse ones.
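The abstract does not spell out GrC-FIM's inference rule, so the following is only a minimal sketch of the general idea of inferring a support count from already counted sub-itemsets. It assumes the common "key itemset" definition (an itemset is key if no proper subset has the same support) and the rule that an itemset with a non-key subset one item smaller takes the minimum support of its subsets of that size. The function name `infer_support`, the `support` and `is_key` tables, and the toy counts are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Toy support counts, as if obtained in earlier mining passes over five
# transactions: {a,b,c}, {a,b,c}, {a,b}, {a}, {c}. (Hypothetical data.)
support = {
    frozenset("a"): 4,
    frozenset("b"): 3,
    frozenset("c"): 3,
    frozenset("ab"): 3,
    frozenset("ac"): 2,
    frozenset("bc"): 2,
}

# Assumed definition: an itemset is "key" if its support is strictly lower
# than that of every proper subset. {a,b} has the same support as {b},
# so it is non-key.
is_key = {
    frozenset("a"): True,
    frozenset("b"): True,
    frozenset("c"): True,
    frozenset("ab"): False,
    frozenset("ac"): True,
    frozenset("bc"): True,
}


def infer_support(candidate, support, is_key):
    """Infer the support of `candidate` without counting or reconstruction.

    Returns the minimum support among the subsets with one item fewer if at
    least one of them is non-key; returns None when every such subset is key,
    meaning the support still has to be counted (or reconstructed) directly.
    """
    size = len(candidate) - 1
    subsets = [frozenset(s) for s in combinations(sorted(candidate), size)]
    if any(not is_key[s] for s in subsets):
        return min(support[s] for s in subsets)
    return None


inferred_abc = infer_support(frozenset("abc"), support, is_key)
inferred_ac = infer_support(frozenset("ac"), support, is_key)
```

In this toy data every transaction containing b also contains a, so the support of {a,b,c} must equal that of {b,c}, which is what the minimum rule returns. For {a,c} both single-item subsets are key, so the sketch returns None and the count would have to come from the (costly) reconstruction step instead.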