We are often interested in testing whether a given cause has a given effect. If we cannot specify the nature of the factors involved, such tests are called model-free studies. There are two major strategies for demonstrating associations between risk factors (i.e., patterns) and outcome phenotypes (i.e., class labels). The first is the prospective study design, where the analysis is based on the concept of "relative risk": what fraction of the exposed (i.e., having the pattern) or unexposed (i.e., lacking the pattern) individuals have the phenotype (i.e., the class label)? The second is the retrospective study design, where the analysis is based on the concept of "odds ratio": the odds that a case has been exposed to a risk factor are compared to the odds for a case that has not been exposed. The efficient extraction of patterns that have good relative risk and/or odds ratio has not previously been studied in the data mining context. In this paper, we investigate such patterns. We show that this pattern space can be systematically stratified into plateaus of convex spaces based on their support levels. Exploiting convexity, we formulate a number of sound and complete algorithms to extract the most general and the most specific of such patterns at each support level, and we compare these algorithms. We further demonstrate that the most efficient among them is able to mine these sophisticated patterns at a speed comparable to that of mining frequent closed patterns, which are patterns that satisfy considerably simpler conditions.
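As an illustration of the two measures named in the abstract, the following is a minimal sketch of how relative risk and odds ratio could be computed for a single pattern. It assumes each transaction is a set of items paired with a boolean class label, and it uses the standard 2x2 contingency-table forms from epidemiology (relative risk as a ratio of conditional fractions, odds ratio as a cross-product ratio). The function names and the toy data are hypothetical and are not taken from the paper; this is not the paper's mining algorithm, only the scoring of one candidate pattern.

```python
from fractions import Fraction


def contingency(transactions, pattern):
    """Return the 2x2 counts (a, b, c, d) for one pattern.

    a: has the pattern and the class label    b: has the pattern, lacks the label
    c: lacks the pattern, has the label       d: lacks the pattern and the label
    """
    pattern = frozenset(pattern)
    a = b = c = d = 0
    for items, has_label in transactions:
        exposed = pattern <= set(items)  # the pattern occurs in this transaction
        if exposed and has_label:
            a += 1
        elif exposed:
            b += 1
        elif has_label:
            c += 1
        else:
            d += 1
    return a, b, c, d


def relative_risk(a, b, c, d):
    """Fraction of exposed with the label divided by fraction of unexposed with it.

    Raises ZeroDivisionError if a row is empty or no unexposed record has the label.
    """
    return Fraction(a, a + b) / Fraction(c, c + d)


def odds_ratio(a, b, c, d):
    """Cross-product ratio (a*d)/(b*c); raises ZeroDivisionError if b or c is 0."""
    return Fraction(a * d, b * c)


# Hypothetical toy data: (items, has_class_label)
transactions = [
    ({"x", "y"}, True),
    ({"x", "y", "z"}, True),
    ({"x", "y"}, True),
    ({"x", "y"}, False),
    ({"x"}, True),
    ({"z"}, False),
    ({"y"}, False),
    ({"x", "z"}, False),
]

counts = contingency(transactions, {"x", "y"})
print(counts)                  # (3, 1, 1, 3)
print(relative_risk(*counts))  # 3
print(odds_ratio(*counts))     # 9
```

Exact `Fraction` arithmetic is used here so that patterns with equal scores compare as equal; a practical implementation would also need a policy for empty cells, which this sketch leaves as an error.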