Assessing rules with interestingness measures is central to the successful application of association rule discovery. However, the discovered rules are usually large in number, and some of them are not interesting or significant for the application at hand. In this paper, we present a systematic approach for assessing the discovered rules, together with a precise statistical approach that supports this framework. The proposed strategy combines data mining and statistical measurement techniques, including redundancy analysis, sampling, and multivariate statistical analysis, to discard the non-significant rules. Moreover, we consider real-world datasets characterized by uniform and non-uniform data/item distributions with a mixture of measurement levels throughout the data/items. The proposed unified framework is applied to these datasets to demonstrate its effectiveness in discarding many of the redundant or non-significant rules while still preserving the high accuracy of the rule set as a whole.
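To make the idea of discarding non-significant rules concrete, the following is a minimal sketch and not the framework proposed in the paper. It assumes transactions are given as Python sets, computes the standard support, confidence, and lift of a rule A -> C, and uses a plain chi-squared test of independence on the 2x2 contingency table as a stand-in significance filter. The function names `rule_measures` and `is_significant` and the choice of a 0.05 significance level are illustrative assumptions.

```python
# Illustrative sketch only: a simple chi-squared filter for association rules.
# This is NOT the paper's method; names and the 0.05 level are assumptions.

# Critical value of the chi-squared distribution, 1 degree of freedom, alpha = 0.05.
CHI2_CRITICAL_DF1_ALPHA05 = 3.841


def rule_measures(transactions, antecedent, consequent):
    """Return support, confidence, lift and chi-squared for antecedent -> consequent."""
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n = len(transactions)
    if n == 0:
        raise ValueError("transactions must not be empty")

    n_a = sum(1 for t in transactions if a <= t)         # contain antecedent
    n_c = sum(1 for t in transactions if c <= t)         # contain consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # contain both

    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0

    # 2x2 contingency table: rows = antecedent present/absent,
    # columns = consequent present/absent.
    observed = [
        [n_ac, n_a - n_ac],
        [n_c - n_ac, n - n_a - n_c + n_ac],
    ]
    row_totals = [n_a, n - n_a]
    col_totals = [n_c, n - n_c]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip degenerate cells to avoid division by zero
                chi2 += (observed[i][j] - expected) ** 2 / expected

    return {"support": support, "confidence": confidence, "lift": lift, "chi2": chi2}


def is_significant(measures, critical=CHI2_CRITICAL_DF1_ALPHA05):
    """Keep a rule only if its chi-squared statistic exceeds the critical value."""
    return measures["chi2"] > critical


# Toy data: "a" and "c" always co-occur in the first set, and are independent in the second.
associated = [{"a", "c"}] * 5 + [set()] * 5
independent = [{"a", "c"}] * 2 + [{"a"}] * 2 + [{"c"}] * 2 + [set()] * 2

strong_rule = rule_measures(associated, {"a"}, {"c"})
weak_rule = rule_measures(independent, {"a"}, {"c"})
kept = [m for m in (strong_rule, weak_rule) if is_significant(m)]
```

In this toy data only the first rule survives the filter. A chi-squared test with small expected counts is unreliable in practice, so a real pipeline would likely need an exact test and a correction for multiple comparisons across the many candidate rules.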