Memory issues in frequent itemset mining

  • Authors: Bart Goethals
  • Affiliations: HIIT Basic Research Unit, University of Helsinki, Finland
  • Venue: Proceedings of the 2004 ACM symposium on Applied computing
  • Year: 2004

Abstract

During the past decade, many algorithms have been proposed to solve the frequent itemset mining problem, i.e., to find all sets of items that frequently occur together in a given database of transactions. Although very efficient techniques have been presented, they all suffer from the same problem: they are inherently dependent on the amount of main memory available. If this amount is insufficient, the presented techniques either are no longer applicable or pay a significant performance penalty. In this paper, we give a rigorous comparison of current state-of-the-art techniques and present a new and simple technique, based on sorting the transaction database, that yields an algorithm for frequent itemset mining which uses less memory and is sometimes more efficient.
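
To make the problem statement concrete, the following is a minimal Python sketch of frequent itemset mining by brute-force, level-wise counting. It is not the sorting-based technique the paper proposes, nor any of the algorithms it compares; it only illustrates the definition. The function name `frequent_itemsets`, the absolute `min_support` threshold, and the sample data are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset that occurs in at
    least `min_support` transactions (hypothetical helper, illustration only).

    Itemsets are represented as sorted tuples of items.
    """
    result = {}
    size = 1
    while True:
        counts = Counter()
        for transaction in transactions:
            # Deduplicate and sort so each itemset has one canonical tuple.
            items = sorted(set(transaction))
            for candidate in combinations(items, size):
                counts[candidate] += 1
        # Keep only the itemsets of this size that meet the threshold.
        level = {s: c for s, c in counts.items() if c >= min_support}
        if not level:
            # Every superset of an infrequent itemset is infrequent, so
            # no larger frequent itemsets can exist once a level is empty.
            break
        result.update(level)
        size += 1
    return result


# Hypothetical sample database of four transactions.
sample = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
found = frequent_itemsets(sample, min_support=2)
```

With this sample and a threshold of 2, each single item and each pair is frequent, while the triple occurs only once and is discarded. Note that `counts` holds every candidate of a level in memory at once, which hints at the kind of main-memory dependence the abstract describes.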