Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Finding interesting rules from large sets of discovered association rules
CIKM '94 Proceedings of the third international conference on Information and knowledge management
Efficient parallel data mining for association rules
CIKM '95 Proceedings of the fourth international conference on Information and knowledge management
An effective hash-based algorithm for mining association rules
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Mining quantitative association rules in large relational tables
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Dynamic itemset counting and implication rules for market basket data
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Data mining, hypergraph transversals, and machine learning (extended abstract)
PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Efficiently mining long patterns from databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Mining frequent patterns without candidate generation
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
A fast distributed algorithm for mining association rules
DIS '96 Proceedings of the fourth international conference on on Parallel and distributed information systems
Levelwise Search and Borders of Theories in KnowledgeDiscovery
Data Mining and Knowledge Discovery
Pincer Search: A New Algorithm for Discovering the Maximum Frequent Set
EDBT '98 Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology
Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique
ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Discovery of Multiple-Level Association Rules from Large Databases
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Sampling Large Databases for Association Rules
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
New Algorithms for Fast Discovery of Association Rules
New Algorithms for Fast Discovery of Association Rules
Algorithms for mining association rules in bag databases
Information Sciences—Informatics and Computer Science: An International Journal
An efficiently algorithm based on itemsets-lattice and bitmap index for finding frequent itemsets
FSKD'05 Proceedings of the Second international conference on Fuzzy Systems and Knowledge Discovery - Volume Part II
Hi-index | 0.00 |
Most algorithms for association rule mining are variants of the basic Apriori algorithm (Agarwal and Srikant, Fast algorithms for mining association rules in databases, in: Proceedings of the 20th International Conference on Very Large Data Bases (VLDB'94), Santiago, Chile, 1994, pp. 487-499). One characteristic of these Apriori-based algorithms is that candidate itemsets are generated in rounds, with the size of the itemsets incremented by one per round. The number of database scans required by Apriori-based algorithms thus depends on the size of the biggest frequent itemsets. In this paper, we devise a more general candidate set generation algorithm, LGen, which generates candidate itemsets of multiple sizes during each database scan. We present an algorithm FindLarge which uses LGen to find frequent itemsets. We show that, given a reasonable set of suggested frequent itemsets, FindLarge can significantly reduce the number of I/O passes required. In the best cases, only two passes are sufficient to discover all the frequent itemsets irrespective of the size of the biggest ones.Two I/O-saving algorithms, namely DIC and Pincher-Search, are compared with FindLarge in a series of experiments. We discuss the conditions under which FindLarge significantly outperforms the others in terms of I/O efficiency.