Pincer-Search: An Efficient Algorithm for Discovering the Maximum Frequent Set

Authors:
D.-I. Lin; Z. M. Kedem
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2002

Abstract

Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up, breadth-first search direction. The computation starts from the frequent 1-itemsets (the minimum-length frequent itemsets) and continues until all maximal (length) frequent itemsets are found. During the execution, every frequent itemset is explicitly considered. Such algorithms perform well when all maximal frequent itemsets are short. However, performance deteriorates drastically when some of the maximal frequent itemsets are long. We present a new algorithm that combines the bottom-up and top-down searches. The primary search direction is still bottom-up, but a restricted search is also conducted in the top-down direction. This top-down search is used only for maintaining and updating a new data structure, the maximum frequent candidate set. That structure is used to prune, at an early stage, candidates that would normally be encountered in the bottom-up search. A very important characteristic of the algorithm is that it does not require explicit examination of every frequent itemset. Therefore, the algorithm performs well even when some maximal frequent itemsets are long. As its output, the algorithm produces the maximum frequent set, i.e., the set containing all maximal frequent itemsets, which immediately specifies all frequent itemsets, since an itemset is frequent exactly when it is a subset of some maximal frequent itemset. We evaluate the performance of the algorithm using well-known synthetic benchmark databases and real-life census and stock market databases. The improvement in performance can be up to several orders of magnitude compared to the best previous algorithms.
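
The two-way search described in the abstract can be illustrated with a minimal Python sketch. It is a simplified variant written for illustration, not the authors' implementation: candidates are enumerated directly from the maximum frequent candidate set (MFCS) instead of using the paper's candidate-join and recovery procedures, supports are counted by a naive scan, and all names (`support_count`, `split_mfcs`, `maximum_frequent_set`, `min_count`) are hypothetical. The threshold `min_count` is assumed to be an absolute transaction count of at least 1.

```python
from itertools import combinations


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def split_mfcs(mfcs, infrequent):
    """Top-down step: ensure no MFCS element contains `infrequent`."""
    for m in [m for m in mfcs if infrequent <= m]:
        mfcs.discard(m)
        for item in infrequent:
            smaller = m - {item}
            # Keep only non-empty sets not already covered by another element.
            if smaller and not any(smaller <= other for other in mfcs):
                mfcs.add(smaller)


def maximum_frequent_set(transactions, min_count):
    """Return the set of maximal frequent itemsets (simplified sketch)."""
    transactions = [frozenset(t) for t in transactions]
    items = frozenset().union(*transactions)
    if not items:
        return set()
    support = {}    # cache of counted supports (assumption: fits in memory)
    mfcs = {items}  # maximum frequent candidate set, starts with all items
    mfs = set()     # maximal frequent itemsets confirmed so far
    k = 1
    while mfcs - mfs:
        unresolved = mfcs - mfs
        # Bottom-up candidates: k-subsets of unresolved MFCS elements,
        # skipping those already known frequent as subsets of an MFS element.
        candidates = {frozenset(c) for m in unresolved
                      for c in combinations(m, k)}
        candidates = {c for c in candidates
                      if not any(c <= f for f in mfs)}
        # One "pass": count the candidates and the unresolved MFCS elements.
        for s in candidates | unresolved:
            if s not in support:
                support[s] = support_count(s, transactions)
        # Each infrequent candidate shrinks the MFCS (top-down direction).
        for c in candidates:
            if support[c] < min_count:
                split_mfcs(mfcs, c)
        # MFCS elements whose counted support is high enough are maximal.
        mfs |= {m for m in mfcs if support.get(m, 0) >= min_count}
        k += 1
    return mfs


if __name__ == "__main__":
    data = ["abcd", "abcd", "abe", "cd"]  # each string is one transaction
    result = maximum_frequent_set(data, min_count=2)
    print(sorted("".join(sorted(m)) for m in result))  # ['abcd']
```

In this toy run, the long itemset {a, b, c, d} is confirmed as frequent in the second pass because it is counted as an MFCS element, so its 3-item subsets are never counted individually. This mirrors the abstract's claim that not every frequent itemset needs explicit examination.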