Pattern discovery techniques, such as association rule discovery, explore large search spaces of potential patterns to find those that satisfy user-specified constraints. Because so many patterns are considered, these techniques suffer from an extreme risk of type-1 error, that is, of finding patterns that appear to satisfy the constraints on the sample data due to chance alone. This paper proposes techniques to overcome this problem by applying well-established statistical practices. These allow the user to enforce a strict upper limit on the risk of experimentwise error. Empirical studies demonstrate that standard pattern discovery techniques can discover numerous spurious patterns when applied to random data and, when applied to real-world data, produce large numbers of patterns that are rejected when subjected to sound statistical evaluation. The studies also reveal that a number of pragmatic choices about how such tests are performed can greatly affect their power.
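
To make the idea of a strict upper limit on experimentwise error concrete, here is a minimal sketch of one well-established practice: testing each candidate pattern with a one-sided Fisher exact test and applying a Bonferroni correction. The abstract does not say which tests or corrections the paper uses, so this choice is an assumption. The function names (`fisher_one_sided`, `significant_patterns`) and the 2x2 table layout are likewise hypothetical, introduced only for illustration.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for a 2x2 contingency table.

    Assumed layout: a = X and Y, b = X without Y, c = Y without X,
    d = neither. Returns the probability of seeing at least `a`
    co-occurrences if X and Y were independent.
    """
    n = a + b + c + d
    row_x = a + b          # records containing X
    col_y = a + c          # records containing Y
    upper = min(row_x, col_y)
    # Sum the hypergeometric tail from the observed count upward.
    tail = sum(
        comb(row_x, k) * comb(n - row_x, col_y - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col_y)


def significant_patterns(candidates, alpha=0.05):
    """Keep patterns that pass a Bonferroni-adjusted threshold.

    `candidates` is a list of (name, (a, b, c, d)) pairs and is assumed
    to contain every pattern that was tested, not only the promising
    ones; otherwise the correction understates the risk.
    """
    if not candidates:
        return []
    threshold = alpha / len(candidates)  # Bonferroni: split alpha evenly
    return [
        name
        for name, table in candidates
        if fisher_one_sided(*table) <= threshold
    ]


if __name__ == "__main__":
    tested = [
        ("strong", (10, 0, 0, 10)),  # X and Y always co-occur
        ("weak", (5, 5, 5, 5)),      # X and Y look independent
    ]
    print(significant_patterns(tested))
```

Dividing `alpha` by the number of tests bounds the probability of accepting any spurious pattern at `alpha`, at the cost of reduced power as the number of tested patterns grows. That is the kind of trade-off the closing sentence of the abstract alludes to.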