Efficient mining of frequent itemsets in distorted databases

Authors:
Jinlong Wang;Congfu Xu
Affiliations:
Institute of Artificial Intelligence, Zhejiang University, Hangzhou, China;Institute of Artificial Intelligence, Zhejiang University, Hangzhou, China
Venue:
AI'06 Proceedings of the 19th Australian joint conference on Artificial Intelligence: advances in Artificial Intelligence
Year:
2006

Abstract

Recently, the data perturbation approach has been applied to data mining: original data values are modified so that reconstructing the values of any individual transaction is difficult. However, mining distorted databases incurs substantial overhead compared with mining undistorted data sets. This paper presents GrC-FIM, an algorithm that uses granular computing (GrC) to address the efficiency of frequent itemset mining in distorted databases. Using the key granule concept and granule inference, the support counts of candidate non-key frequent itemsets can be inferred from the counts of their frequent sub-itemsets obtained earlier in the mining process, which eliminates the tedious support reconstruction for these itemsets. Accuracy improves on dense data sets and is unchanged on sparse ones.
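
The inference idea in the abstract can be illustrated with a minimal sketch. It is not the GrC-FIM algorithm itself, whose details are not given here. It assumes the well-known counting-inference rule for key patterns: if a candidate itemset has a non-key frequent sub-itemset, its support equals the minimum support of its immediate sub-itemsets and need not be counted. The names `mine_with_inference` and `count_support` are hypothetical, and `count_support` scans a plain, undistorted transaction list as a stand-in for the costly support reconstruction a distorted database would require.

```python
from itertools import combinations


def mine_with_inference(transactions, min_count):
    """Level-wise frequent itemset mining with counting inference.

    Returns (support, counted): a dict {frozenset: support count} and the
    number of itemsets whose support had to be counted explicitly.
    """
    tx = [frozenset(t) for t in transactions]
    n = len(tx)

    def count_support(itemset):
        # Hypothetical stand-in for the expensive step; in a distorted
        # database this would be the support reconstruction.
        return sum(1 for t in tx if itemset <= t)

    support, is_key, counted = {}, {}, 0

    # Level 1: every single item must be counted.
    level = []
    for item in sorted({i for t in tx for i in t}):
        s = frozenset([item])
        c = count_support(s)
        counted += 1
        if c >= min_count:
            support[s] = c
            is_key[s] = c != n  # the empty itemset has support n
            level.append(s)

    k = 2
    while level:
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        next_level = []
        for cand in candidates:
            subsets = [frozenset(s) for s in combinations(cand, k - 1)]
            if any(s not in support for s in subsets):
                continue  # some sub-itemset is infrequent, so prune
            bound = min(support[s] for s in subsets)
            if all(is_key[s] for s in subsets):
                # All sub-itemsets are key: the support must be counted.
                c = count_support(cand)
                counted += 1
                key = c != bound
            else:
                # A non-key sub-itemset exists: infer the support.
                c, key = bound, False
            if c >= min_count:
                support[cand] = c
                is_key[cand] = key
                next_level.append(cand)
        level = next_level
        k += 1
    return support, counted
```

Calling `mine_with_inference([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}], 2)` returns the supports together with the number of explicit counts. Because item `a` occurs in every transaction it is non-key, so the supports of `{a, b}` and `{a, c}` are inferred rather than counted.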