Selecting the right objective measure for association analysis

Authors:
Pang-Ning Tan;Vipin Kumar;Jaideep Srivastava
Affiliations:
Department of Computer Science, University of Minnesota, 200 Union Street SE, Minneapolis, MN (all three authors)
Venue:
Information Systems - Knowledge discovery and data mining (KDD 2002)
Year:
2004



Abstract

Objective measures such as support, confidence, interest factor, correlation, and entropy are often used to evaluate the interestingness of association patterns. However, in many situations, these measures may provide conflicting information about the interestingness of a pattern. Data mining practitioners also tend to apply an objective measure without realizing that there may be better alternatives available for their application. In this paper, we describe several key properties one should examine in order to select the right measure for a given application. A comparative study of these properties is made using twenty-one measures that were originally developed in diverse fields such as statistics, social science, machine learning, and data mining. We show that depending on its properties, each measure is useful for some application, but not for others. We also demonstrate two scenarios in which many existing measures become consistent with each other, namely, when support-based pruning and a technique known as table standardization are applied. Finally, we present an algorithm for selecting a small set of patterns such that domain experts can find a measure that best fits their requirements by ranking this small set of patterns.
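To make the claim about conflicting measures concrete, the following is a minimal sketch, not the authors' code. It computes four of the measures named above (support, confidence, interest factor, and the phi correlation coefficient) for a rule A -> B from a 2x2 contingency table, using the standard textbook definitions. The function name `measures`, the dictionary keys, and the two example tables are hypothetical and chosen only for illustration.

```python
from math import sqrt


def measures(f11, f10, f01, f00):
    """Objective measures for the rule A -> B from a 2x2 contingency table.

    f11: transactions containing both A and B
    f10: transactions containing A but not B
    f01: transactions containing B but not A
    f00: transactions containing neither
    """
    n = f11 + f10 + f01 + f00
    p_ab = f11 / n          # P(A, B)
    p_a = (f11 + f10) / n   # P(A)
    p_b = (f11 + f01) / n   # P(B)
    return {
        "support": p_ab,
        "confidence": p_ab / p_a,        # P(B | A)
        "interest": p_ab / (p_a * p_b),  # interest factor, also called lift
        "phi": (p_ab - p_a * p_b)
        / sqrt(p_a * (1 - p_a) * p_b * (1 - p_b)),
    }


# Two hypothetical tables over 1000 transactions each (illustrative data only).
# Table y is table x with presence and absence swapped.
x = measures(880, 50, 50, 20)
y = measures(20, 50, 50, 880)

for name in ("support", "confidence", "interest", "phi"):
    print(f"{name:>10}: x={x[name]:.3f}  y={y[name]:.3f}")
```

On these two tables, support and confidence rank x above y, the interest factor ranks y above x, and phi scores them equally, so the choice of measure changes which pattern looks more interesting.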