Mining long high utility itemsets in transaction databases

  • Authors:
  • Guangzhu Yu;Shihuang Shao;Xianhui Zeng

  • Affiliations:
  • Information and Technology College, DongHua University, Shanghai, China;Information and Technology College, DongHua University, Shanghai, China;Information and Technology College, DongHua University, Shanghai, China

  • Venue:
  • WSEAS Transactions on Information Science and Applications
  • Year:
  • 2008

Abstract

Existing utility mining algorithms are column-enumeration based and adopt an Apriori-like candidate generate-and-test approach, so they perform poorly on high-dimensional datasets or datasets with long patterns. To address this problem, this paper proposes a hybrid model and a row-enumeration based algorithm, called inter-transaction, that discovers high utility itemsets from two directions: existing algorithms such as UMining [1] can be used to seek short high utility itemsets from the bottom up, while inter-transaction seeks long high utility itemsets from the top down. By intersecting relevant transactions, the new algorithm identifies long high utility itemsets directly, without extending short itemsets step by step. In addition, new pruning strategies reduce the search space, and an optimization technique speeds up the intersection of transactions. Experiments on synthetic data show that our method achieves high performance, especially on large, high-dimensional datasets.
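
The idea of finding long itemsets by intersecting transactions can be illustrated with a small sketch. This is not the paper's inter-transaction algorithm: it intersects only single transactions and pairs of transactions, applies no pruning or intersection optimization, and assumes the standard utility definition (quantity times unit profit, summed over every transaction that contains the itemset). The names `itemset_utility`, `long_high_utility_itemsets`, `profit`, `min_util`, and `min_len` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def itemset_utility(itemset, transactions, profit):
    """Sum quantity * unit profit of the itemset over transactions containing it."""
    total = 0
    for t in transactions:
        # t maps item -> purchased quantity; iterating a dict yields its keys
        if itemset.issubset(t):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def long_high_utility_itemsets(transactions, profit, min_util, min_len):
    """Naive row-enumeration sketch (assumption: pairwise intersections only).

    Candidates are the item sets of single transactions and of every pair
    of transactions, kept only if they contain at least min_len items.
    """
    candidates = set()
    for t in transactions:
        items = frozenset(t)
        if len(items) >= min_len:
            candidates.add(items)
    for a, b in combinations(transactions, 2):
        common = frozenset(a.keys() & b.keys())
        if len(common) >= min_len:
            candidates.add(common)

    # Keep only candidates whose total utility reaches the threshold
    result = {}
    for candidate in candidates:
        utility = itemset_utility(candidate, transactions, profit)
        if utility >= min_util:
            result[candidate] = utility
    return result
```

Because candidates come from whole transactions rather than from growing short itemsets one item at a time, a long itemset shared by two rows is found in a single intersection. A fuller row-enumeration search would intersect larger groups of transactions and prune groups that cannot reach the utility threshold.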