Mining frequent itemsets in large databases: The hierarchical partitioning approach

  • Authors:
  • Fan-Chen Tseng

  • Affiliations:
  • Department of Multimedia and M-Commerce, Kainan University, No. 1, Kainan Road, Luzhu, Taoyuan County 33857, Taiwan, ROC

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2013

Quantified Score

Hi-index 12.05

Abstract

Although many methods have been proposed to improve the efficiency of data mining, little research has addressed the issue of scalability, that is, the problem of mining frequent itemsets when the database is very large. This study proposes a methodology, hierarchical partitioning, for mining frequent itemsets in large databases, based on a novel data structure called the Frequent Pattern List (FPL). A major feature of the FPL is its ability to partition the database, transforming it into a set of sub-databases of manageable size. As a result, a divide-and-conquer approach can be applied to perform the desired data-mining tasks. Experimental results show that hierarchical partitioning is capable of mining frequent itemsets and frequent closed itemsets in very large databases.
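
The abstract does not describe the internals of the FPL, so the following is only a minimal sketch of the general divide-and-conquer idea, not the paper's algorithm. It assumes one generic partitioning rule: each transaction is placed in the sub-database of its least frequent item, each sub-database is mined recursively, and the trimmed transactions are then passed on to the next sub-database. The function name `mine_frequent_itemsets`, the `buckets` structure, and the ordering rule are all hypothetical choices made for illustration.

```python
from collections import Counter


def mine_frequent_itemsets(transactions, min_support, prefix=frozenset()):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    Illustrative sketch only: a generic partition-and-recurse miner,
    not the FPL structure from the paper.
    """
    # Count each item once per transaction and keep the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    order = sorted(
        (i for i, c in counts.items() if c >= min_support),
        key=lambda i: (counts[i], i),  # least frequent first (assumed rule)
    )
    rank = {item: r for r, item in enumerate(order)}

    # Partition: every transaction lands in exactly one sub-database,
    # the one belonging to its lowest-ranked frequent item.
    buckets = [[] for _ in order]
    for t in transactions:
        kept = sorted((i for i in set(t) if i in rank), key=rank.get)
        if kept:
            buckets[rank[kept[0]]].append(kept)

    results = {}
    for r, item in enumerate(order):
        # By now, bucket r holds every transaction that contains `item`.
        itemset = prefix | {item}
        results[itemset] = len(buckets[r])

        # Conquer: mine the sub-database with `item` removed.
        tails = [t[1:] for t in buckets[r]]
        results.update(mine_frequent_itemsets(tails, min_support, itemset))

        # Pass the trimmed transactions on to their next sub-database.
        for tail in tails:
            if tail:
                buckets[rank[tail[0]]].append(tail)
    return results


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in sorted(
        mine_frequent_itemsets(db, 3).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(itemset), support)
```

Because only one sub-database is mined at a time, a scheme like this can in principle keep the working set small, which is the property the abstract attributes to partitioning. Mining frequent closed itemsets, which the paper also covers, is not shown here.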