Mining association rules is an important technique for discovering meaningful patterns in transaction databases. Many different measures of interestingness have been proposed for association rules. However, these measures fail to take the probabilistic properties of the mined data into account. We start by presenting a simple probabilistic framework for transaction data, which can be used to simulate transaction data in which no associations are present. We use such simulated data and a real-world database from a grocery outlet to explore the behavior of confidence and lift, two popular interest measures used for rule mining. The results show that confidence is systematically influenced by the frequency of the items in the left-hand side of rules and that lift performs poorly at filtering random noise in transaction data. Based on the probabilistic framework, we develop two new interest measures, hyper-lift and hyper-confidence, which can be used to filter or order mined association rules. The new measures show significantly better performance than lift for applications where spurious rules are problematic.
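To make the setting concrete, the following is a minimal sketch, not the authors' implementation, of simulating transaction data without associations and computing confidence and lift on it. It assumes that "no associations" means each item occurs in each transaction independently with a fixed probability. The item names, the probabilities, and the function names (`simulate_transactions`, `confidence`, `lift`) are hypothetical and chosen only for illustration. Confidence and lift use their standard definitions: confidence(X => Y) = count(X and Y) / count(X), and lift(X => Y) = confidence(X => Y) / support(Y).

```python
import random


def simulate_transactions(item_probs, n_transactions, seed=0):
    """Draw transactions in which every item occurs independently.

    item_probs maps a (hypothetical) item name to its occurrence
    probability; independence is the assumed "no associations" model.
    """
    rng = random.Random(seed)
    return [
        {item for item, p in item_probs.items() if rng.random() < p}
        for _ in range(n_transactions)
    ]


def count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(transactions, lhs, rhs):
    """count(lhs and rhs) / count(lhs), or None if lhs never occurs."""
    c_lhs = count(transactions, lhs)
    if c_lhs == 0:
        return None
    return count(transactions, lhs | rhs) / c_lhs


def lift(transactions, lhs, rhs):
    """Observed co-occurrence count divided by the count expected
    under independence, or None if either side never occurs."""
    n = len(transactions)
    c_lhs = count(transactions, lhs)
    c_rhs = count(transactions, rhs)
    if c_lhs == 0 or c_rhs == 0:
        return None
    return count(transactions, lhs | rhs) * n / (c_lhs * c_rhs)


if __name__ == "__main__":
    # Two frequent and two rare items, all independent by construction.
    probs = {"a": 0.5, "b": 0.3, "c": 0.01, "d": 0.01}
    data = simulate_transactions(probs, n_transactions=10_000, seed=42)
    for lhs, rhs in [({"a"}, {"b"}), ({"c"}, {"d"})]:
        print(sorted(lhs), "=>", sorted(rhs),
              "confidence:", confidence(data, lhs, rhs),
              "lift:", lift(data, lhs, rhs))
```

Because the items are independent by construction, every rule found in such data is spurious, and the expected lift is 1. For rules between rare items, the co-occurrence count is a small integer, so the observed lift can deviate strongly from 1 purely by chance. This is the kind of noise that the abstract reports lift filters poorly.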