Given a transaction database, regarded as a global set of transactions, and a local database obtained by conditioning the global one, we consider pairs of itemsets whose degree of correlation is higher in the local database than in the global one. Finding paired itemsets with high correlation in a single database is the known problem of correlation discovery, which has been studied because highly correlated itemsets are characteristic of the database. However, paired itemsets that are not characteristic are also meaningful, provided their degree of correlation increases significantly in the local database relative to the global one. Such pairs can serve as implicit, hidden evidence that something particular to the local database is occurring, even though they were not previously recognized as characteristic. From this viewpoint, we have proposed measuring the significance of paired itemsets by the difference between the two correlations before and after conditioning of the global database, and have defined the notion of DC pairs, whose degrees of difference of correlation are high. In this paper, we develop an algorithm for mining DC pairs and apply it to a transaction database with time stamp data. Finding DC pairs in large databases is computationally hard in general, because the algorithm must check even noncharacteristic paired itemsets. Nevertheless, we show that our algorithm, equipped with some pruning rules, succeeds in finding DC pairs that may be significant.
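To make the idea of a difference of correlations concrete, the following is a minimal brute-force sketch, not the algorithm developed in the paper. It rests on several assumptions that the abstract does not state: correlation is taken to be lift, P(X and Y) / (P(X) P(Y)); the difference is taken as local correlation minus global correlation; only single-item pairs are enumerated; and no pruning rules are applied. The function names (`support`, `lift`, `dc_score`, `find_dc_pairs`), the threshold `min_diff`, and the toy data with a `weekend` item standing in for a time stamp condition are all hypothetical.

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def lift(transactions, x, y):
    """Assumed correlation measure: P(X and Y) / (P(X) * P(Y)).

    Returns None when either itemset never occurs, since lift is undefined.
    """
    px = support(transactions, x)
    py = support(transactions, y)
    if px == 0 or py == 0:
        return None
    return support(transactions, frozenset(x) | frozenset(y)) / (px * py)


def dc_score(global_db, local_db, x, y):
    """Assumed difference of correlation: local lift minus global lift."""
    g = lift(global_db, x, y)
    loc = lift(local_db, x, y)
    if g is None or loc is None:
        return None
    return loc - g


def find_dc_pairs(global_db, condition, min_diff):
    """Exhaustively score all single-item pairs (no pruning)."""
    local_db = [t for t in global_db if condition(t)]
    items = sorted(set().union(*global_db))
    results = []
    for a, b in combinations(items, 2):
        score = dc_score(global_db, local_db, {a}, {b})
        if score is not None and score >= min_diff:
            results.append((a, b, score))
    return sorted(results, key=lambda r: -r[2])


# Toy data: "a" and "b" are uncorrelated globally (lift 1.0) but
# co-occur strongly among "weekend" transactions (lift 2.0).
global_db = [
    frozenset({"weekend", "a", "b"}),
    frozenset({"weekend", "a", "b"}),
    frozenset({"weekend", "c"}),
    frozenset({"weekend", "d"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"a", "d"}),
    frozenset({"b", "d"}),
]

pairs = find_dc_pairs(global_db, lambda t: "weekend" in t, min_diff=0.5)
print(pairs)  # [('a', 'b', 1.0)]
```

In this toy data the pair (a, b) would not be reported by ordinary correlation discovery on the global database, because its global lift is exactly 1.0; it surfaces only when the local and global correlations are compared. The exhaustive loop over all pairs also illustrates why the problem becomes expensive on large databases and why pruning is needed in practice.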