Using a Hash-Based Method with Transaction Trimming for Mining Association Rules

Authors:
Jong Soo Park; Ming-Syan Chen; Philip S. Yu
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
1997


Abstract

In this paper, we examine the issue of mining association rules among items in a large database of sales transactions. Mining association rules means, given a database of sales transactions, discovering all associations among items such that the presence of some items in a transaction implies the presence of other items in the same transaction. The mining of association rules can be mapped into the problem of discovering large itemsets, where a large itemset is a group of items that appears in a sufficient number of transactions. The problem of discovering large itemsets can be solved by first constructing a candidate set of itemsets and then identifying, within this candidate set, those itemsets that meet the large itemset requirement. Generally, this is done iteratively for each large k-itemset in increasing order of k, where a large k-itemset is a large itemset with k items. Determining large itemsets from a huge number of candidate sets in early iterations is usually the dominating factor for the overall data-mining performance. To address this issue, we develop an effective algorithm for candidate set generation. It is a hash-based algorithm and is especially effective for generating the candidate set for large 2-itemsets. Explicitly, the number of candidate 2-itemsets generated by the proposed algorithm is orders of magnitude smaller than that generated by previous methods, thus resolving the performance bottleneck. Note that the generation of smaller candidate sets enables us to effectively trim the transaction database size at a much earlier stage of the iterations, thereby significantly reducing the computational cost of later iterations. The proposed algorithm also provides the opportunity to reduce the amount of disk I/O required. An extensive simulation study is conducted to evaluate the performance of the proposed algorithm.
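
The two ideas in the abstract, hash-based pruning of candidate 2-itemsets and early trimming of the transaction database, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `bucket_of` and `mine_pairs`, the hash formula, the bucket count of 11, and the specific trimming rule are all assumptions chosen for illustration, and items are assumed to be integer IDs.

```python
from collections import Counter
from itertools import combinations


def bucket_of(pair, num_buckets):
    # Hypothetical hash function for a sorted pair of integer item IDs.
    a, b = pair
    return (a * 31 + b) % num_buckets


def mine_pairs(transactions, min_support, num_buckets=11):
    # Pass 1: count single items and, in the same scan, hash every
    # 2-itemset of each transaction into a bucket counter.
    item_counts = Counter()
    bucket_counts = [0] * num_buckets
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket_of(pair, num_buckets)] += 1

    large_items = sorted(i for i, c in item_counts.items() if c >= min_support)

    # A pair's bucket count is an upper bound on its support, so a pair
    # whose bucket is below min_support cannot be large and is dropped
    # before it is ever counted exactly.
    candidates = {
        pair
        for pair in combinations(large_items, 2)
        if bucket_counts[bucket_of(pair, num_buckets)] >= min_support
    }

    # Pass 2: count the surviving candidates exactly and trim the database.
    pair_counts = Counter()
    trimmed = []
    for t in transactions:
        items = sorted(set(t))
        hits = Counter()
        for pair in combinations(items, 2):
            if pair in candidates:
                pair_counts[pair] += 1
                hits.update(pair)
        # Assumed trimming rule: an item can belong to a large 3-itemset
        # in this transaction only if it occurs in at least two candidate
        # pairs here, and a transaction needs at least three such items.
        kept = [i for i in items if hits[i] >= 2]
        if len(kept) >= 3:
            trimmed.append(kept)

    large_pairs = {p: c for p, c in pair_counts.items() if c >= min_support}
    return candidates, large_pairs, trimmed


transactions = [[1, 2, 3], [1, 2, 3], [1, 2, 4], [2, 4], [3, 5]]
candidates, large_pairs, trimmed = mine_pairs(transactions, min_support=2)
print(sorted(candidates))
print(large_pairs)
print(trimmed)
```

On this toy input, four items are large, so pairing them naively would give six candidate 2-itemsets; the bucket filter leaves four, and the trimming step passes only two of the five transactions on to the next iteration. Hash collisions can let some non-large pairs through, which is why the exact count in the second pass is still needed.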