Given a family of transaction databases, various data mining methods for extracting patterns that distinguish one database from another have been studied extensively. This paper focuses on the problem of finding patterns that are nearly uncorrelated in one database, called the base, and begin to be correlated to some extent in another database, called the target. The detected patterns are not highly correlated in the target. Despite this weak correlation in the target, they are regarded as indicative because they are uncorrelated in the base. We design the search procedure for such patterns as a constrained optimization problem. More precisely, the objective is to minimize the correlation of a pattern in the base, subject to an upper bound on its correlation in the target and a lower bound on the change in correlation between the two databases. Because there are many potential solutions, we apply top-N control, which retains the patterns with the N lowest correlation values in the base among all patterns satisfying the constraints. Because we measure the degree of correlation by k-way mutual information, which is monotonically increasing with respect to item addition, we can design a dynamic pruning method that discards useless items under the top-N control. This pruning greatly reduces the computational cost, over the whole search process, of calculating correlation values for sets of items treated as random variables. As a result, we present a complete search procedure that produces only the top-N solution patterns from the set of all patterns satisfying the constraints, and we show its effectiveness and efficiency through experiments.
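The search described above can be illustrated with a minimal sketch. It is not the paper's procedure. It assumes that k-way mutual information means the total correlation (the sum of the marginal entropies minus the joint entropy) of the items' presence/absence indicators, and it uses a plain depth-first enumeration. The function names, the parameters `max_target` and `min_change`, and the toy databases are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(db, items):
    """Joint entropy (in bits) of the presence/absence indicators of `items`."""
    n = len(db)
    counts = Counter(tuple(i in t for i in items) for t in db)
    return -sum(c / n * log2(c / n) for c in counts.values())


def k_way_mi(db, items):
    """Assumed definition: sum of marginal entropies minus joint entropy."""
    return sum(entropy(db, (i,)) for i in items) - entropy(db, items)


def bottom_n_patterns(base, target, items, n, max_target, min_change):
    """Return up to n (base_correlation, pattern) pairs, lowest base correlation first.

    A pattern qualifies if its target correlation is at most `max_target`
    and its correlation rises by at least `min_change` from base to target.
    """
    best = []  # kept sorted by base correlation, never longer than n

    def extend(pattern, start):
        for idx in range(start, len(items)):
            cand = pattern + (items[idx],)
            if len(cand) >= 2:
                c_base = k_way_mi(base, cand)
                c_target = k_way_mi(target, cand)
                # Correlation never decreases when an item is added, so every
                # superset of cand also violates the target upper bound.
                if c_target > max_target:
                    continue
                # Top-N pruning: no superset can have a lower base correlation
                # than the current N-th best solution.
                if len(best) == n and c_base >= best[-1][0]:
                    continue
                if c_target - c_base >= min_change:
                    best.append((c_base, cand))
                    best.sort()
                    del best[n:]
            extend(cand, idx + 1)

    extend((), 0)
    return best


# Toy data (hypothetical): a and b are independent in the base database
# and always co-occur in the target database.
base = [{"a", "b", "c"}, {"a"}, {"b", "c"}, set()]
target = [{"a", "b"}, {"a", "b", "c"}, {"c"}, set()]

result = bottom_n_patterns(base, target, ["a", "b", "c"], n=1,
                           max_target=1.0, min_change=0.5)
```

Both pruning steps rely only on the monotonicity stated above. The lower bound on the correlation change is not monotone under item addition, so the sketch checks it per pattern and does not prune on it.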