Data analysis is an inherently iterative process: what we already know about the data largely determines our expectations, and hence which result we would find most interesting. With this in mind, we introduce a well-founded approach for succinctly summarizing data with a collection of itemsets. Using a probabilistic maximum entropy model, we iteratively find the most interesting itemset and then update our model of the data accordingly. As we only include itemsets that are surprising with regard to the current model, the summary is guaranteed to be both descriptive and non-redundant. The algorithm that we present can either mine the top-k most interesting itemsets, or use the Bayesian Information Criterion to automatically identify the model containing only the itemsets most important for describing the data. In other words, it will 'tell you what you need to know'. Experiments on synthetic and benchmark data show that the discovered summaries are succinct and correctly identify the key patterns in the data. The models they form attain high likelihoods, and inspection shows that they summarize the data well with increasingly specific, yet non-redundant, itemsets.
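The iterative loop described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a handful of binary items, so the maximum entropy model can be fitted by iterative scaling over all 2^n possible transactions, and it assumes a particular surprise score (the binary KL divergence between an itemset's observed and model-predicted frequency). All function names (`fit_maxent`, `surprise`, `bic`, `summarize`) and the parameters `k`, `max_size`, and `sweeps` are hypothetical.

```python
import itertools
import math

EPS = 1e-9  # keeps frequencies away from 0 and 1 so logs and ratios stay finite


def clamp(x):
    return min(max(x, EPS), 1.0 - EPS)


def covers(transaction, itemset):
    """True if the transaction contains every item of the itemset."""
    return all(transaction[i] for i in itemset)


def data_freq(rows, itemset):
    """Observed fraction of rows that contain the itemset."""
    return sum(covers(r, itemset) for r in rows) / len(rows)


def model_freq(space, probs, itemset):
    """Frequency the current model assigns to the itemset."""
    return sum(p for t, p in zip(space, probs) if covers(t, itemset))


def fit_maxent(n_items, itemsets, freqs, sweeps=200):
    """Iterative scaling over the full transaction space (small n only)."""
    space = list(itertools.product((0, 1), repeat=n_items))
    probs = [1.0 / len(space)] * len(space)  # no constraints: uniform model
    for _ in range(sweeps):
        for itemset, f in zip(itemsets, freqs):
            f = clamp(f)
            q = clamp(model_freq(space, probs, itemset))
            # Rescale so the model frequency of this itemset becomes f.
            probs = [p * (f / q if covers(t, itemset) else (1 - f) / (1 - q))
                     for t, p in zip(space, probs)]
    return space, probs


def surprise(f, q):
    """Assumed score: binary KL divergence of observed f from predicted q."""
    f, q = clamp(f), clamp(q)
    return f * math.log(f / q) + (1 - f) * math.log((1 - f) / (1 - q))


def bic(rows, space, probs, n_itemsets):
    """Negative log-likelihood plus a penalty of (log N)/2 per itemset."""
    prob = dict(zip(space, probs))
    loglik = sum(math.log(prob[tuple(r)]) for r in rows)
    return -loglik + 0.5 * n_itemsets * math.log(len(rows))


def summarize(rows, k=5, max_size=3):
    """Greedily add the most surprising itemset while BIC keeps improving."""
    n_items = len(rows[0])
    candidates = [c for r in range(1, max_size + 1)
                  for c in itertools.combinations(range(n_items), r)]
    chosen, freqs = [], []
    space, probs = fit_maxent(n_items, chosen, freqs)
    best_bic = bic(rows, space, probs, 0)
    while len(chosen) < k:
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda c: surprise(
            data_freq(rows, c), model_freq(space, probs, c)))
        trial = fit_maxent(n_items, chosen + [best],
                           freqs + [data_freq(rows, best)])
        trial_bic = bic(rows, *trial, len(chosen) + 1)
        if trial_bic >= best_bic:  # the extra itemset does not pay for itself
            break
        chosen.append(best)
        freqs.append(data_freq(rows, best))
        (space, probs), best_bic = trial, trial_bic
    return chosen


if __name__ == "__main__":
    # Toy data: items 0 and 1 always co-occur, item 2 is independent.
    rows = [(1, 1, 0), (1, 1, 1), (0, 0, 0), (0, 0, 1)] * 25
    print(summarize(rows, k=3))
```

On this toy data the pair (0, 1) is selected first: the initial uniform model predicts a frequency of 0.25 for it, while the observed frequency is 0.5. Enumerating the full transaction space only works for a few items; a practical implementation would need a more efficient way to query the model.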