LCM ver.3: collaboration of array, bitmap and prefix tree for frequent itemset mining

  • Authors:
  • Takeaki Uno; Masashi Kiyomi; Hiroki Arimura

  • Affiliations:
  • National Institute of Informatics, Hitotsubashi, Chiyoda-ku, Tokyo, Japan; National Institute of Informatics, Hitotsubashi, Chiyoda-ku, Tokyo, Japan; Hokkaido University, Sapporo, Japan

  • Venue:
  • Proceedings of the 1st international workshop on open source data mining: frequent pattern mining implementations
  • Year:
  • 2005

Abstract

For a transaction database, a frequent itemset is an itemset included in at least a specified number of transactions. In finding all frequent itemsets, the heaviest task is computing the frequency of each candidate itemset. Previous studies have used roughly three kinds of data structures, with corresponding algorithms, for this computation: bitmaps, prefix trees, and array lists. Each has its own advantages and disadvantages depending on the density of the input database. In this paper, we propose an efficient way to combine these three data structures so that the combination gives the best performance in every case.
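
To make the three representations named in the abstract concrete, the following is a minimal Python sketch of frequency (support) counting with each of them on a toy database. It is an illustration only, not the LCM ver.3 algorithm or the authors' combination scheme; the data in `TRANSACTIONS` and all function and class names are hypothetical and chosen for this example.

```python
from collections import defaultdict

# Toy transaction database (hypothetical data; items are small integers).
TRANSACTIONS = [
    [1, 2, 3],
    [1, 2],
    [2, 3],
    [1, 2, 3],
    [3],
]


def support_array(transactions, itemset):
    """Array lists: scan every transaction and test inclusion."""
    target = set(itemset)
    return sum(1 for t in transactions if target.issubset(t))


def build_bitmaps(transactions):
    """Bitmap: one integer per item; bit i is set if transaction i has the item."""
    bitmaps = defaultdict(int)
    for tid, t in enumerate(transactions):
        for item in t:
            bitmaps[item] |= 1 << tid
    return bitmaps


def support_bitmap(bitmaps, n_transactions, itemset):
    """AND the per-item bitmaps and count the bits that remain set."""
    mask = (1 << n_transactions) - 1
    for item in itemset:
        mask &= bitmaps.get(item, 0)
    return bin(mask).count("1")


class Node:
    """Prefix-tree node; count = number of transactions passing through it."""

    def __init__(self):
        self.count = 0
        self.children = {}


def build_prefix_tree(transactions):
    """Insert each transaction as a path of items in ascending order."""
    root = Node()
    for t in transactions:
        node = root
        node.count += 1
        for item in sorted(t):
            node = node.children.setdefault(item, Node())
            node.count += 1
    return root


def support_prefix_tree(node, itemset):
    """Count transactions containing itemset (itemset sorted ascending)."""
    if not itemset:
        return node.count
    first = itemset[0]
    total = 0
    for item, child in node.children.items():
        if item == first:
            total += support_prefix_tree(child, itemset[1:])
        elif item < first:
            total += support_prefix_tree(child, itemset)
        # item > first: paths are sorted, so `first` cannot occur below.
    return total


def is_frequent(transactions, itemset, min_support):
    """An itemset is frequent if at least min_support transactions include it."""
    return support_array(transactions, itemset) >= min_support
```

All three functions return the same support for a given itemset; they differ only in how the database is stored and scanned, which is where the density of the input starts to matter.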