Data Mining and Knowledge Discovery
This paper presents new techniques for focusing the discovery of frequent itemsets within large, dense datasets that contain highly frequent items. Highly frequent items add significantly to the cost of computing the complete set of frequent itemsets. Our approach allows such items to be excluded during the candidate generation phase of the Apriori algorithm. The highly frequent items can then be reintroduced through an inferencing framework, which makes it possible to generate frequent itemsets without counting their frequency. We demonstrate these techniques within the well-studied framework of the Apriori algorithm. We also provide empirical results on both synthetic and real datasets; both are relevant because the real datasets exhibit statistical characteristics that differ from the probabilistic assumptions behind the synthetic data. The real data were drawn from the U.S. Census.
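The two-phase idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the function names, the toy baskets, and the cutoff `HIGH_COUNT` are hypothetical, and the reintroduction step uses a generic inclusion-exclusion lower bound, supp(X and h) >= supp(X) + supp(h) - N, as a stand-in for the paper's inferencing framework, whose details are not given here.

```python
from collections import Counter
from itertools import combinations


def apriori(transactions, min_count, excluded=frozenset()):
    """Level-wise Apriori that ignores every item in `excluded`."""
    baskets = [frozenset(t) - excluded for t in transactions]
    counts = Counter(frozenset([item]) for b in baskets for item in b)
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Counting step: one pass over the baskets per level.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result


def infer_with_item(frequent, item, item_count, n, min_count):
    """Reintroduce `item` without counting (assumed rule, not the paper's).

    Uses the lower bound supp(X + item) >= supp(X) + supp(item) - n and
    keeps only the itemsets whose bound already meets `min_count`.
    """
    inferred = {}
    for itemset, count in frequent.items():
        bound = count + item_count - n
        if bound >= min_count:
            inferred[itemset | {item}] = bound
    return inferred


# Toy data: "us_citizen" plays the role of a highly frequent item.
transactions = [
    {"us_citizen", "a", "b"},
    {"us_citizen", "a", "b"},
    {"us_citizen", "b"},
    {"us_citizen", "a", "c"},
    {"us_citizen", "b", "c"},
    {"a", "c"},
]
N = len(transactions)
MIN_COUNT = 2   # minimum support, as an absolute count
HIGH_COUNT = 5  # hypothetical cutoff for "highly frequent"

item_counts = Counter(item for t in transactions for item in t)
hot_items = frozenset(i for i, c in item_counts.items() if c >= HIGH_COUNT)

# Phase 1: mine frequent itemsets with the hot items excluded.
frequent = apriori(transactions, MIN_COUNT, excluded=hot_items)

# Phase 2: reintroduce each hot item by inference instead of counting.
inferred = {}
for hot in hot_items:
    inferred.update(
        infer_with_item(frequent, hot, item_counts[hot], N, MIN_COUNT)
    )
```

With this toy data, `{a, us_citizen}`, `{b, us_citizen}` and `{c, us_citizen}` are reported as frequent without a counting pass. The bound is conservative: `{a, b, us_citizen}` occurs in two baskets and so meets the threshold, but the bound cannot confirm it. A complete inferencing framework would need a stronger rule than the one assumed here.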