Interestingness measures for association rules based on statistical validity

Authors:
Izwan Nizal Mohd. Shaharanee;Fedja Hadzic;Tharam S. Dillon
Affiliations:
Digital Ecosystem and Business Intelligence Institute, Curtin University of Technology, Perth 6102, Australia;Digital Ecosystem and Business Intelligence Institute, Curtin University of Technology, Perth 6102, Australia;Digital Ecosystem and Business Intelligence Institute, Curtin University of Technology, Perth 6102, Australia
Venue:
Knowledge-Based Systems
Year:
2011

Abstract

Assessing rules with interestingness measures is central to the successful application of association rule discovery. However, the discovered association rules are normally large in number, and some of them are not interesting or significant for the application at hand. In this paper, we present a systematic approach for verifying the discovered rules, and provide a precise statistical basis supporting this framework. The proposed strategy combines data mining and statistical measurement techniques, including redundancy analysis, sampling and multivariate statistical analysis, to discard the non-significant rules. Moreover, we consider real-world datasets characterized by uniform and non-uniform distributions of data items, with a mixture of measurement levels throughout. The proposed unified framework is applied to these datasets to demonstrate its effectiveness in discarding many of the redundant or non-significant rules while still preserving the high accuracy of the rule set as a whole.
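The idea of discarding non-significant rules can be illustrated with a small sketch. This is not the framework from the paper, whose details the abstract does not give. It swaps in a plain Pearson chi-square test of independence between a rule's antecedent A and consequent B. The function names, the rule tuples, the counts and the 0.05 threshold are all hypothetical choices made for illustration.

```python
import math


def chi_square_rule_test(n_ab, n_a, n_b, n):
    """Pearson chi-square test (1 degree of freedom) for a rule A -> B.

    n_ab: transactions containing both A and B
    n_a:  transactions containing A
    n_b:  transactions containing B
    n:    total number of transactions
    Assumes 0 < n_a < n and 0 < n_b < n, so no expected count is zero.
    """
    # Observed 2x2 table: rows = A present/absent, columns = B present/absent.
    observed = [
        [n_ab, n_a - n_ab],
        [n_b - n_ab, n - n_a - n_b + n_ab],
    ]
    row_totals = [n_a, n - n_a]
    col_totals = [n_b, n - n_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def filter_rules(rules, n, alpha=0.05):
    """Keep the names of rules whose A/B dependence is significant at level alpha."""
    kept = []
    for name, n_ab, n_a, n_b in rules:
        _, p_value = chi_square_rule_test(n_ab, n_a, n_b, n)
        if p_value < alpha:
            kept.append(name)
    return kept


# Hypothetical counts over 1000 transactions: (name, n_ab, n_a, n_b).
example_rules = [
    ("bread -> butter", 200, 400, 300),  # co-occurs more often than chance
    ("milk -> eggs", 100, 500, 200),     # exactly what independence predicts
]
print(filter_rules(example_rules, n=1000))
```

A test like this only checks whether A and B are statistically dependent. It does not remove redundant rules, and it does not correct for the many tests that a large rule set implies, so it covers at most one ingredient of the strategy described in the abstract.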