Efficiently identifying exploratory rules’ significance

Authors:
Shiying Huang; Geoffrey I. Webb
Affiliations:
School of Computer Science and Software Engineering, Monash University, Melbourne, VIC, Australia; School of Computer Science and Software Engineering, Monash University, Melbourne, VIC, Australia
Venue:
Data Mining
Year:
2006

Abstract

Efficiently discarding potentially uninteresting rules is an important research focus in exploratory rule discovery. Many algorithms have been proposed that automatically remove such rules using background knowledge and user-specified constraints. Applying a significance test to exploratory rules is desirable because it removes rules that appear interesting only by chance, giving users a more compact set of resulting rules. However, statistical tests for identifying significant rules require considerable computation and data access to obtain the necessary statistics, and the cost grows with the size of the database. In this paper, we propose two approaches for improving the efficiency of significant exploratory rule discovery. We also evaluate their effect experimentally in impact rule discovery, which is suitable for discovering exploratory rules in very large, dense databases.
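
To make the idea of a rule being "interesting by chance" concrete, the following is a minimal sketch of one common kind of significance filter. It is not the test or either of the two approaches proposed in the paper. It assumes a categorical target and uses a one-sided Fisher exact test to ask whether a rule does better than its generalization (the same rule with one condition removed). The function names, the 2x2 layout, and the 0.05 threshold are all illustrative assumptions.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed (hypothetical) layout, restricted to records covered by the
    generalization:
      a: covered by the rule, target present
      b: covered by the rule, target absent
      c: covered only by the generalization, target present
      d: covered only by the generalization, target absent
    Returns the probability of seeing at least `a` target records under the
    rule if the extra condition were unrelated to the target.
    """
    rule_cover = a + b
    target_total = a + c
    n = a + b + c + d
    upper = min(rule_cover, target_total)
    # Sum hypergeometric probabilities for outcomes at least as extreme as a.
    numerator = sum(
        comb(rule_cover, k) * comb(n - rule_cover, target_total - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(n, target_total)


def is_significant(a: int, b: int, c: int, d: int, alpha: float = 0.05) -> bool:
    """Keep the rule only if its improvement is unlikely to be due to chance."""
    return fisher_exact_greater(a, b, c, d) <= alpha


if __name__ == "__main__":
    # Rule covers 10 records (8 with the target); the rest of the
    # generalization covers 6 records (1 with the target).
    print(is_significant(8, 2, 1, 5))
    # Same target rate (50%) inside and outside the rule.
    print(is_significant(5, 5, 3, 3))
```

Even this small test needs four counts per candidate rule, each of which must be gathered from the data. That is the kind of per-rule computation and data access the abstract identifies as the bottleneck.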