Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Beyond market baskets: generalizing association rules to correlations
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Detecting Group Differences: Mining Contrast Sets
Data Mining and Knowledge Discovery
Discovering Frequent Closed Itemsets for Association Rules
ICDT '99 Proceedings of the 7th International Conference on Database Theory
Selecting the right interestingness measure for association patterns
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Fast vertical mining using diffsets
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Interestingness measures for data mining: A survey
ACM Computing Surveys (CSUR)
Discovering Significant Patterns
Machine Learning
An efficient rigorous approach for identifying statistically significant frequent itemsets
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
From Association Analysis to Causal Discovery
Proceedings of Workshop on Machine Learning for Sensory Data Analysis
Detection of daily living activities using a two-stage Markov model
Journal of Ambient Intelligence and Smart Environments - Intelligent agents in Ambient Intelligence and smart environments
Hi-index | 0.00 |
Association rule mining is an important problem in the data mining area. It enumerates and tests a large number of rules on a dataset and outputs rules that satisfy user-specified constraints. Due to the large number of rules being tested, rules that do not represent real systematic effect in the data can satisfy the given constraints purely by random chance. Hence association rule mining often suffers from a high risk of false positive errors. There is a lack of comprehensive study on controlling false positives in association rule mining. In this paper, we adopt three multiple testing correction approaches---the direct adjustment approach, the permutation-based approach and the holdout approach---to control false positives in association rule mining, and conduct extensive experiments to study their performance. Our results show that (1) Numerous spurious rules are generated if no correction is made. (2) The three approaches can control false positives effectively. Among the three approaches, the permutation-based approach has the highest power of detecting real association rules, but it is very computationally expensive. We employ several techniques to reduce its cost effectively.