A new efficient formulation for frequent item-set generation

  • Authors:
  • Ketan D. Shah;Sunita Mahajan

  • Affiliations:
  • MPSTME, SVKM's NMIMS University, Vile-Parle (west), Mumbai;M.E.T., Bandra (west), Mumbai

  • Venue:
  • Proceedings of the International Conference on Advances in Computing, Communication and Control
  • Year:
  • 2009

Abstract

This paper proposes a new, efficient formulation for generating the frequent item-sets used to compute association rules. It addresses the shortcomings of the serial Apriori algorithm and of its parallel formulation based on count distribution (CD). The formulation reduces the time taken during every pass of the CD algorithm and helps improve scalability. The scalability and efficiency of serial Apriori and of the CD algorithm are analyzed with respect to the proposed formulation. As the database size increases, the efficiency of the CD algorithm decreases because of the large number of scans through the entire database. The proposed approach addresses this problem and suggests improvements that achieve better scalability and efficiency. The experiments conducted show that the proposed formulation scales linearly with the number of transactions. The algorithm also has excellent scale-up properties with respect to the transaction size and the number of items in the dataset.
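For context, the baseline the abstract refers to can be sketched as follows. This is a minimal, single-process illustration of Apriori with count distribution, not the formulation the paper proposes (the abstract does not describe it in detail). The function names, the list-of-partitions data layout, and the absolute `min_support` threshold are assumptions made for illustration; each partition stands in for one processor's share of the database, and summing the local counters stands in for the global count exchange.

```python
from collections import Counter
from itertools import combinations


def generate_candidates(frequent, k):
    """Return k-item-sets whose every (k-1)-subset is frequent.

    Brute-force stand-in for Apriori candidate generation (assumed helper).
    """
    items = sorted({item for itemset in frequent for item in itemset})
    candidates = set()
    for combo in combinations(items, k):
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1)):
            candidates.add(frozenset(combo))
    return candidates


def local_counts(partition, candidates):
    """Count candidate occurrences in one partition (one 'processor')."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def count_distribution(partitions, min_support):
    """Return {item-set: global count} for item-sets meeting min_support.

    Every pass scans every partition once, which is the cost the abstract
    says grows with database size.
    """
    candidates = {
        frozenset([item])
        for partition in partitions
        for transaction in partition
        for item in transaction
    }
    result = {}
    k = 1
    while candidates:
        global_counts = Counter()
        for partition in partitions:
            # Summing local counters simulates the global count exchange.
            global_counts.update(local_counts(partition, candidates))
        frequent = {c: n for c, n in global_counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = generate_candidates(set(frequent), k)
    return result
```

Calling `count_distribution(partitions, min_support)` with `partitions` given as a list of lists of sets returns the frequent item-sets with their global counts. The sketch makes one full scan of every partition per pass, which is the repeated-scan behaviour the abstract identifies as the bottleneck of CD.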