Efficient Discovery of Statistically Significant Association Rules

Authors:
Wilhelmiina Hämäläinen;Matti Nykänen
Affiliations:
-;-
Venue:
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Year:
2008

Abstract

Searching for statistically significant association rules is an important but neglected problem. Traditional association rules do not capture the idea of statistical dependence: the resulting rules can be spurious, while the most significant rules may be missing. This leads to erroneous models and predictions, which often prove expensive. The problem is computationally very difficult because significance is not a monotonic property. However, in this paper we prove several other properties that can be used for pruning the search space. The properties are implemented in the StatApriori algorithm, which searches for statistically significant, non-redundant association rules. Based on both theoretical and empirical observations, the resulting rules are very accurate compared to traditional association rules. In addition, StatApriori can work with extremely low frequencies, thus finding new interesting rules.
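To make the contrast in the abstract concrete, the following is a minimal sketch of how a frequent, high-confidence rule can carry no statistical dependence while a rare rule is highly significant. It is not the StatApriori algorithm and does not use the paper's significance measure: the one-sided Fisher exact test, the function names `rule_stats` and `fisher_p`, and all counts are assumptions chosen purely for illustration.

```python
from math import comb


def rule_stats(n, n_x, n_a, n_xa):
    """Return (support, confidence, lift) for a rule X -> A.

    n    : total number of transactions
    n_x  : transactions containing the antecedent X
    n_a  : transactions containing the consequent A
    n_xa : transactions containing both X and A
    """
    support = n_xa / n
    confidence = n_xa / n_x
    lift = confidence / (n_a / n)  # lift == 1 means X and A look independent
    return support, confidence, lift


def fisher_p(n, n_x, n_a, n_xa):
    """One-sided Fisher exact p-value (an assumed measure, not the paper's).

    Probability of seeing at least n_xa co-occurrences if X and A were
    independent and the marginal counts n_x and n_a were fixed.
    """
    upper = min(n_x, n_a)
    tail = sum(
        comb(n_a, k) * comb(n - n_a, n_x - k) for k in range(n_xa, upper + 1)
    )
    return tail / comb(n, n_x)


# Hypothetical counts in a data set of 1000 transactions.
# A frequent rule with high confidence but no dependence:
frequent = (1000, 500, 900, 450)
# A rare rule that a minimum-frequency threshold would likely discard:
rare = (1000, 20, 30, 15)

for name, counts in (("frequent", frequent), ("rare", rare)):
    support, confidence, lift = rule_stats(*counts)
    p_value = fisher_p(*counts)
    print(f"{name}: support={support:.3f} confidence={confidence:.2f} "
          f"lift={lift:.2f} p={p_value:.3g}")
```

Under these assumed counts, the frequent rule has confidence 0.9 only because the consequent itself occurs in 90% of the transactions, so its lift is 1 and the test gives no evidence of dependence. The rare rule has much lower support yet a very small p-value, which is the kind of rule a frequency threshold alone would miss.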