Dynamic itemset counting and implication rules for market basket data
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Beyond market baskets: generalizing association rules to correlations
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
A new framework for itemset generation
PODS '98 Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Efficient mining of emerging patterns: discovering trends and differences
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Transversing itemset lattices with statistical metric pruning
PODS '00 Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Detecting Group Differences: Mining Contrast Sets
Data Mining and Knowledge Discovery
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
On detecting differences between groups
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Using Emerging Patterns and Decision Trees in Rare-Class Classification
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Engineering Applications of Artificial Intelligence
An algorithm for mining implicit itemset pairs based on differences of correlations
DS'05 Proceedings of the 8th international conference on Discovery Science
Hi-index | 0.01 |
Given a transaction database as a global set of transactions and its sub-database regarded as a local one, we consider a pair of itemsets whose degrees of correlations are higher in the local database than in the global one. If they show high correlation in the local database, they are detectable by some search methods of previous studies. On the other hand, there exist another kind of paired itemsets such that they are not regarded as characteristic and cannot be found by the methods of previous studies but that their degrees of correlations become drastically higher by the conditioning to the local database. We pay much attention to the latter kind of paired itemsets, as such pairs of itemsets can be an implicit and hidden evidence showing that something particular to the local database occurs even though they are not yet realized as characteristic ones. From this viewpoint, we measure paired itemsets by a difference of two correlations before and after the conditioning to the local database, and define a notion of DC pairs whose degrees of differences of correlations are high. As the measure is non-monotonic, we present an algorithm, searching for DC pairs, with some new pruning rules for cutting off hopeless itemsets. We show by an experimental result that potentially significant DC pairs can be actually found for a given database and the algorithm successfully detects such DC pairs.