Maximum Entropy Based Significance of Itemsets
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
We consider the problem of defining the significance of an itemset. We say that an itemset is significant if its frequency is surprising given the frequencies of its sub-itemsets. In other words, we estimate the frequency of the itemset from the frequencies of its sub-itemsets and compute the deviation between the observed value and the estimate. For the estimation we use Maximum Entropy, and for measuring the deviation we use Kullback–Leibler divergence. A major advantage over previous methods is that we can use richer models, whereas previous approaches measure only the deviation from the independence model. We show that our measure of significance goes to zero for derivable itemsets and that we can use the rank as a statistical test. Our empirical results demonstrate that, for our real datasets, the independence assumption is too strong, but applying more flexible models leads to good results.
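The idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the function names (frequency, maxent_estimate, bernoulli_kl), the toy data, and the choice of iterative proportional fitting as the Maximum Entropy solver are all assumptions made for illustration. The deviation is computed here as the Kullback–Leibler divergence between two Bernoulli distributions (observed versus estimated frequency), which may differ from the exact definition used in the paper. The sketch also assumes every constrained frequency lies strictly between 0 and 1.

```python
from itertools import combinations, product
from math import log


def frequency(data, items):
    """Fraction of rows in which every item in `items` is present."""
    return sum(all(row[i] for i in items) for row in data) / len(data)


def maxent_estimate(k, constraints, sweeps=200):
    """Fit a maximum entropy distribution over all 2**k binary rows.

    `constraints` maps a tuple of item indices to the required frequency
    of that sub-itemset. Iterative proportional fitting is used as the
    solver (an assumption of this sketch). Returns {row: probability}.
    """
    rows = list(product((0, 1), repeat=k))
    p = {r: 1.0 / len(rows) for r in rows}  # uniform starting point
    for _ in range(sweeps):
        for items, freq in constraints.items():
            covered = {r for r in rows if all(r[i] for i in items)}
            mass = sum(p[r] for r in covered)
            # Rescale so the fitted frequency of `items` equals `freq`.
            for r in rows:
                if r in covered:
                    p[r] *= freq / mass
                else:
                    p[r] *= (1.0 - freq) / (1.0 - mass)
    return p


def bernoulli_kl(observed, estimated):
    """KL divergence (in nats) between two Bernoulli distributions.

    Assumes `estimated` lies strictly between 0 and 1.
    """
    total = 0.0
    for a, b in ((observed, estimated), (1.0 - observed, 1.0 - estimated)):
        if a > 0.0:
            total += a * log(a / b)
    return total


# Toy data (hypothetical): 10 transactions over 3 items.
data = ([(1, 1, 1)] * 3 + [(1, 1, 0), (1, 0, 1), (0, 1, 1),
                           (1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 0)])
k = 3

# Constrain the model with the frequencies of all proper sub-itemsets.
constraints = {
    items: frequency(data, items)
    for size in range(1, k)
    for items in combinations(range(k), size)
}

fitted = maxent_estimate(k, constraints)
observed = frequency(data, tuple(range(k)))
estimated = fitted[(1,) * k]
print("observed:", observed)
print("estimated:", estimated)
print("deviation (KL):", bernoulli_kl(observed, estimated))
```

Restricting the constraints to single items reduces the fitted model to the independence model, so the same code can compare the independence baseline against the richer model that also uses pair frequencies.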