Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Fast discovery of association rules
Advances in knowledge discovery and data mining
Efficient discovery of error-tolerant frequent itemsets in high dimensions
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Support envelopes: a technique for exploring the structure of association patterns
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Discovering frequent itemsets by support approximation and itemset clustering
Data & Knowledge Engineering
Index-BitTableFI: An improved algorithm for mining frequent itemsets
Knowledge-Based Systems
Quantitative evaluation of approximate frequent pattern mining algorithms
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
RAM: Randomized Approximate Graph Mining
SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management
Towards efficient mining of proportional fault-tolerant frequent itemsets
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Agglomerating local patterns hierarchically with ALPHA
Proceedings of the 18th ACM conference on Information and knowledge management
ABBA: adaptive bicluster-based approach to impute missing values in binary matrices
Proceedings of the 2010 ACM Symposium on Applied Computing
An efficient polynomial delay algorithm for pseudo frequent itemset mining
DS'07 Proceedings of the 10th international conference on Discovery science
Ambiguous frequent itemset mining and polynomial delay enumeration
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Significance and recovery of block structures in binary matrices with noise
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Frequent itemset mining is a popular and important first step in analyzing data sets across a broad range of applications. The traditional, "exact" approach to finding frequent itemsets requires that every item in the itemset occur in each supporting transaction. However, real data is typically subject to noise, and in the presence of such noise, traditional itemset mining may fail to detect relevant itemsets, particularly large itemsets, which are more vulnerable to noise. In this paper we propose approximate frequent itemsets (AFI) as a noise-tolerant itemset model. In addition to the usual requirement for sufficiently many supporting transactions, the AFI model places constraints on the fraction of errors permitted in each item column and on the fraction of errors permitted in each supporting transaction. Taken together, these constraints winnow out the approximate itemsets that exhibit systematic errors. In the context of a simple noise model, we demonstrate that AFI recovers underlying data patterns better, and identifies fewer spurious patterns, than either the exact frequent itemset approach or the existing error-tolerant itemset approach of Yang et al. [11].
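The two error constraints described above can be illustrated with a minimal sketch. It only checks whether a given candidate block (a set of transactions and a set of items) satisfies AFI-style constraints; it does not mine itemsets and is not the paper's algorithm. The function name, the parameter names (`min_support`, `max_row_error`, `max_col_error`), the toy matrix, and the "at most this fraction" threshold convention are all assumptions made for illustration.

```python
from typing import Sequence


def is_approximate_itemset(
    matrix: Sequence[Sequence[int]],
    rows: Sequence[int],
    items: Sequence[int],
    min_support: int,
    max_row_error: float,
    max_col_error: float,
) -> bool:
    """Check a candidate (rows, items) block against AFI-style constraints.

    All names and the threshold convention are hypothetical.
    """
    # Usual support requirement: enough supporting transactions.
    if len(rows) < min_support or not items:
        return False

    # Row constraint: each supporting transaction may miss at most a
    # fraction max_row_error of the items in the itemset.
    for r in rows:
        missing = sum(1 for i in items if matrix[r][i] == 0)
        if missing > max_row_error * len(items):
            return False

    # Column constraint: each item may be absent from at most a fraction
    # max_col_error of the supporting transactions.
    for i in items:
        missing = sum(1 for r in rows if matrix[r][i] == 0)
        if missing > max_col_error * len(rows):
            return False

    return True


# Toy binary transaction matrix (rows = transactions, columns = items).
matrix = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]

# Items {0, 1, 2} with scattered noise across transactions 0..3.
noisy_block = is_approximate_itemset(matrix, [0, 1, 2, 3], [0, 1, 2], 3, 0.34, 0.25)

# Same block under the exact model (no errors tolerated).
exact_block = is_approximate_itemset(matrix, [0, 1, 2, 3], [0, 1, 2], 3, 0.0, 0.0)

# Items {0, 1, 3} in transactions 0 and 1: item 3 is missing everywhere,
# a systematic error that the column constraint is meant to catch.
systematic_block = is_approximate_itemset(matrix, [0, 1], [0, 1, 3], 2, 0.34, 0.25)
```

In this toy data, `noisy_block` is accepted because every row and every column has at most one missing entry, while `exact_block` is rejected because only transaction 0 contains all three items. `systematic_block` passes the row constraint (each transaction misses one of three items) but fails the column constraint, since item 3 is absent from every supporting transaction.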