A new efficient formulation for frequent item-set generation

Authors:
Ketan D. Shah;Sunita Mahajan
Affiliations:
MPSTME, SVKM's NMIMS University, Vile-Parle (west), Mumbai;M.E.T., Bandra (west), Mumbai
Venue:
Proceedings of the International Conference on Advances in Computing, Communication and Control
Year:
2009

Citing 4
Cited 0

Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data

An effective hash-based algorithm for mining association rules
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data

A fast distributed algorithm for mining association rules
DIS '96 Proceedings of the fourth international conference on Parallel and distributed information systems

Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Abstract

This paper proposes a new, efficient formulation for generating the frequent item-sets used to compute association rules. It addresses shortcomings of the serial Apriori algorithm and of its parallel formulation based on count distribution (CD). The formulation reduces the time taken by each pass of the CD algorithm, which improves scalability. The scalability and efficiency of serial Apriori and of the CD algorithm are analyzed with respect to the proposed formulation. As the database grows, the efficiency of the CD algorithm decreases because of the large number of scans through the entire database. The proposed approach addresses this problem and suggests improvements that yield better scalability and efficiency. Experiments show that the proposed formulation scales linearly with the number of transactions. The algorithm also has excellent scale-up properties with respect to the transaction size and the number of items in the dataset.
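
For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal, single-process illustration of Apriori with count-distribution-style counting, not the formulation proposed in the paper, whose details the abstract does not give. The function names (`generate_candidates`, `local_counts`, `cd_apriori`), the representation of each processor's data as a list of Python sets, and the use of an absolute `min_support` count are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def generate_candidates(frequent, k):
    """Join frequent (k-1)-item-sets into k-item-set candidates, keeping only
    those whose (k-1)-subsets are all frequent (the Apriori property)."""
    candidates = set()
    for a in frequent:
        for b in frequent:
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def local_counts(partition, candidates):
    """Count candidate occurrences in one partition (one simulated processor)."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:
                counts[candidate] += 1
    return counts


def cd_apriori(partitions, min_support):
    """Hypothetical helper: level-wise mining where every pass scans each
    partition once and the local counts are summed into global counts."""
    # Pass 1: count single items across all partitions.
    item_counts = Counter()
    for partition in partitions:
        for transaction in partition:
            for item in transaction:
                item_counts[frozenset([item])] += 1
    frequent = {s for s, n in item_counts.items() if n >= min_support}
    result = {s: item_counts[s] for s in frequent}

    k = 2
    while frequent:
        candidates = generate_candidates(frequent, k)
        global_counts = Counter()
        # In a real CD setup these scans run in parallel and the counts are
        # exchanged between processors; here they are simply summed in a loop.
        for partition in partitions:
            global_counts.update(local_counts(partition, candidates))
        frequent = {c for c in candidates if global_counts[c] >= min_support}
        result.update({c: global_counts[c] for c in frequent})
        k += 1
    return result
```

In this sketch every pass over k re-reads all partitions in full, which is the repeated whole-database scan that the abstract identifies as the cost that grows with database size.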