Improving the efficiency of FP tree construction using transactional patternbase

Authors:
Imran Ali; Ziauddin; Abdur Rashid; Fazal Masud Khan; Waqas Anwar
Affiliations:
CIIT Abbotabad; ICIT, Gomal University; ICIT, Gomal University; ICIT, Gomal University; CIIT Abbotabad
Venue:
Proceedings of the 8th International Conference on Frontiers of Information Technology
Year:
2010

Citing 6
Cited 0

Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data

Dynamic itemset counting and implication rules for market basket data
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data

Mining frequent patterns without candidate generation
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data

Parallel Algorithms for Discovery of Association Rules
Data Mining and Knowledge Discovery

Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach
Data Mining and Knowledge Discovery

Abstract

Mining frequent patterns in transaction databases has been a popular theme in data mining research. A common task is finding patterns among the large sets of data items in database transactions. The Apriori algorithm is a widely accepted method of generating frequent patterns, but it requires many scans of the database and thus seriously taxes resources. Methods currently used to improve the efficiency of the Apriori algorithm include hash-based itemset counting, transaction reduction, partitioning, sampling, and dynamic itemset counting. The two main approaches to association rule mining are candidate set generation and test, and restricted test only. Both approaches scan a massive database multiple times. In our study, we propose a transaction patternbase, constructed in the first scan of the database. Each distinct transaction pattern is stored in the patternbase once, and its frequency is incremented whenever a transaction with the same pattern recurs. Subsequent passes therefore scan only this compact dataset, which increases the efficiency of the respective methods. We have implemented this technique with the FP-growth method. The technique outperforms the database approach in many situations and performs exceptionally well when the repetition of transaction patterns is higher. It can be used with any association rule mining method.
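
The patternbase idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation; it only assumes that a "pattern" is the set of items in a transaction, regardless of order. The function names (build_patternbase, item_supports, frequent_ordered_patterns) and the min_support parameter are hypothetical names chosen for this illustration. The first function makes the single pass over the database; the other two work only on the compact patternbase and prepare the frequency-ordered item lists that an FP-tree builder would typically insert, each weighted by its frequency.

```python
from collections import Counter


def build_patternbase(transactions):
    """Single pass over the database: map each distinct pattern to its frequency."""
    patternbase = Counter()
    for transaction in transactions:
        # Assumption: a pattern is the unordered set of items in a transaction.
        patternbase[frozenset(transaction)] += 1
    return patternbase


def item_supports(patternbase):
    """Count item supports from the patternbase instead of rescanning the database."""
    supports = Counter()
    for pattern, frequency in patternbase.items():
        for item in pattern:
            supports[item] += frequency
    return supports


def frequent_ordered_patterns(patternbase, min_support):
    """Drop infrequent items and order the rest by descending support.

    Returns (item_list, frequency) pairs; a tree builder could insert each
    list once with weight `frequency` rather than once per transaction.
    """
    supports = item_supports(patternbase)
    result = []
    for pattern, frequency in patternbase.items():
        kept = [item for item in pattern if supports[item] >= min_support]
        # Ties are broken by item name so the ordering is deterministic.
        kept.sort(key=lambda item: (-supports[item], item))
        if kept:
            result.append((kept, frequency))
    return result


if __name__ == "__main__":
    database = [["a", "b"], ["b", "a"], ["a", "c"], ["a", "b"], ["d"]]
    patternbase = build_patternbase(database)
    for items, frequency in frequent_ordered_patterns(patternbase, min_support=2):
        print(items, frequency)
```

In this toy database, five transactions collapse into three patterns, so the second step touches three entries instead of five. The saving grows with the amount of repetition among transactions, which matches the abstract's claim about when the technique performs best.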