Frequent itemset mining has been studied extensively in data mining research ever since association rules were introduced. In this paper we address a limitation of frequent itemsets: they count only the rows in which all of their attributes are present, and so make no allowance for noise. We show that generalizing the concept of frequency while preserving the performance of mining algorithms is nontrivial, and introduce a generalization of frequent itemsets, dense itemsets. Dense itemsets do not require all attributes to be present at the same time; instead, the itemset must define a sufficiently large submatrix in which the density of present attributes exceeds a given threshold.

We consider the problem of computing all dense itemsets in a database. We give a levelwise algorithm for this problem, and also study the top-$k$ variations, i.e., finding the $k$ densest sets with a given support, or the $k$ best-supported sets with a given density. These algorithms select the other parameter automatically, which simplifies exploratory mining of dense itemsets. We show that the concept captures natural facets of data sets, and give extensive empirical results on the performance of the algorithms. Combining the concept of dense itemsets with set cover ideas, we also show that dense itemsets can be used to obtain succinct descriptions of large datasets. Finally, we discuss some variations of dense itemsets.
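To make the contrast between frequency and density concrete, the following is a minimal Python sketch, not the algorithm of the paper. It assumes one plausible reading of the definition above: an itemset is dense for a support of s rows and a threshold delta if some s rows form a submatrix, restricted to the itemset's columns, in which at least a fraction delta of the entries are 1. The function names, the toy data, and the use of a non-strict comparison are all illustrative assumptions.

```python
from typing import List, Sequence


def frequency(rows: Sequence[Sequence[int]], itemset: Sequence[int]) -> int:
    """Classical support: number of rows in which every item of the itemset is present."""
    return sum(all(row[i] for i in itemset) for row in rows)


def density_at_support(rows: Sequence[Sequence[int]],
                       itemset: Sequence[int],
                       support: int) -> float:
    """Highest density reachable by any `support` rows on the itemset's columns.

    For a fixed itemset the best rows are simply those with the most ones
    in its columns, so sorting the per-row counts is sufficient.
    """
    if not itemset or not 0 < support <= len(rows):
        raise ValueError("need a non-empty itemset and 0 < support <= number of rows")
    counts: List[int] = sorted((sum(row[i] for i in itemset) for row in rows),
                               reverse=True)
    return sum(counts[:support]) / (support * len(itemset))


def is_dense(rows: Sequence[Sequence[int]], itemset: Sequence[int],
             support: int, delta: float) -> bool:
    """Assumed test: the best `support`-row submatrix has density at least `delta`."""
    return density_at_support(rows, itemset, support) >= delta


# Hypothetical 0-1 data: five rows, four attributes.
rows = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
itemset = [0, 1, 2]

print(frequency(rows, itemset))              # only the first row contains all three items
print(density_at_support(rows, itemset, 4))  # four rows cover the itemset with few gaps
print(is_dense(rows, itemset, 4, 0.7))
```

On this toy data the itemset is fully present in a single row, yet four rows each contain at least two of its three items, which is the kind of noisy structure that exact frequency counting misses.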