Efficient mining of high utility itemsets from large datasets

  • Authors:
  • Alva Erwin;Raj P. Gopalan;N. R. Achuthan

  • Affiliations:
  • Department of Computing, Curtin University of Technology, Bentley, Western Australia;Department of Computing, Curtin University of Technology, Bentley, Western Australia;Department of Mathematics and Statistics, Curtin University of Technology, Bentley, Western Australia

  • Venue:
  • PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

High utility itemsets mining extends frequent pattern mining to discover itemsets in a transaction database with utility values above a given threshold. However, mining high utility itemsets presents a greater challenge than frequent itemset mining, since high utility itemsets lack the anti-monotone property of frequent itemsets. Transaction Weighted Utility (TWU) proposed recently by researchers has anti-monotone property, but it is an overestimate of itemset utility and therefore leads to a larger search space. We propose an algorithm that uses TWU with pattern growth based on a compact utility pattern tree data structure. Our algorithm implements a parallel projection scheme to use disk storage when the main memory is inadequate for dealing with large datasets. Experimental evaluation shows that our algorithm is more efficient compared to previous algorithms and can mine larger datasets of both dense and sparse data containing long patterns.