For a book, the title and abstract provide a good first impression of what to expect from it. For a database, getting a first impression is not so straightforward. While low-order statistics provide only limited insight, mining the data quickly yields too much detail. In this paper we propose a middle ground, and introduce a parameter-free method for constructing high-quality summaries for binary data. Our method builds a summary by grouping items that strongly correlate, and uses the Minimum Description Length principle to identify the best grouping, without requiring a distance measure between items. Besides offering a practical overview of which attributes interact most strongly, these summaries are also easily queried surrogates for the data. Experiments show that our method discovers high-quality results: correlated attributes are correctly grouped and the supports of frequent itemsets are closely approximated.
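To make the idea of choosing a grouping by description length more concrete, the following is a minimal sketch and not the encoding used in the paper. It assumes binary rows given as tuples, charges each group the empirical entropy of its joint value distribution plus a crude, hypothetical model cost of `log2(n + 1)` bits per free parameter, and merges groups greedily while the total cost drops. The function names `group_cost`, `total_cost`, and `greedy_grouping` are illustrative only.

```python
from collections import Counter
from itertools import combinations
from math import log2


def group_cost(rows, group):
    """Bits needed to describe the columns in `group` (illustrative cost, binary data assumed)."""
    n = len(rows)
    counts = Counter(tuple(row[i] for i in group) for row in rows)
    # Data cost: n times the empirical entropy of the joint value distribution.
    data_bits = -sum(c * log2(c / n) for c in counts.values())
    # Model cost (an assumption of this sketch): 2^k - 1 free parameters,
    # each stored with log2(n + 1) bits.
    model_bits = (2 ** len(group) - 1) * log2(n + 1)
    return data_bits + model_bits


def total_cost(rows, grouping):
    """Total description length of the data under a grouping of the items."""
    return sum(group_cost(rows, g) for g in grouping)


def greedy_grouping(rows):
    """Start from singleton groups and merge the pair that saves the most bits."""
    grouping = [(i,) for i in range(len(rows[0]))]
    while len(grouping) > 1:
        best_gain, best_pair = 0.0, None
        for a, b in combinations(grouping, 2):
            merged = tuple(sorted(a + b))
            gain = group_cost(rows, a) + group_cost(rows, b) - group_cost(rows, merged)
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break  # no merge shortens the description, so stop
        a, b = best_pair
        grouping = [g for g in grouping if g not in (a, b)] + [tuple(sorted(a + b))]
    return sorted(grouping)


# Items 0 and 1 always agree; item 2 is independent of both.
rows = [(a, a, b) for a in (0, 1) for b in (0, 1)] * 10
print(greedy_grouping(rows))
```

On this toy data the two perfectly correlated items end up in one group because encoding them jointly is cheaper than encoding them separately, while the independent item stays alone because merging it would only add model cost. No distance measure between items is needed; the description length alone drives the merges.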