Mining long high utility itemsets in transaction databases

  • Authors:
  • Guangzhu Yu;Shihuang Shao;Daoqing Sun;Bin Luo

  • Affiliations:
  • Information and Technology College, DongHua University, ShangHai, China;Information and Technology College, DongHua University, ShangHai, China;Information and Technology College, DongHua University, ShangHai, China;Automation College, Guangdong University of Technology, Guangzhou, China

  • Venue:
  • SMO'07 Proceedings of the 7th WSEAS International Conference on Simulation, Modelling and Optimization
  • Year:
  • 2007

Abstract

Although support has been used as a fundamental measure of the statistical importance of an itemset, it cannot express richer information such as quantity sold, unit profit, or other numerical attributes. To overcome this shortcoming, utility is used to measure semantic importance, and several algorithms for utility mining have been proposed. However, existing utility mining algorithms adopt an Apriori-like candidate generation-and-test approach and are inadequate on databases with long patterns. To solve this problem, this paper proposes a hybrid model and a novel algorithm, inter-transaction, that together discover high utility itemsets from two directions: existing algorithms such as UMining [1] seek short high utility itemsets from the bottom up, while inter-transaction seeks long high utility itemsets from the top down. To avoid the costly process of extending short itemsets step by step, inter-transaction finds long itemsets directly by intersecting relevant transactions. Experiments on synthetic data show that the new algorithm achieves high performance, especially on high-dimensional data sets.
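The idea of finding long itemsets by intersecting transactions can be illustrated with a minimal sketch. The abstract does not specify which transactions count as "relevant" or how candidates are pruned, so the code below simply intersects every pair of transactions and is not the authors' algorithm. The toy data, the function names, the `min_len` and `min_util` parameters, and the definition of utility as quantity times unit profit summed over all supporting transactions are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> quantity sold.
transactions = [
    {"a": 1, "b": 2, "c": 1, "d": 1},
    {"a": 2, "b": 1, "c": 3, "e": 1},
    {"a": 1, "b": 1, "c": 1, "d": 2},
    {"d": 1, "e": 4},
]
# Hypothetical unit profit per item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 3, "e": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the itemset."""
    total = 0
    for t in transactions:
        if itemset.issubset(t):  # iterating a dict yields its keys (items)
            total += sum(t[i] * unit_profit[i] for i in itemset)
    return total


def long_candidates_by_intersection(transactions, min_len):
    """Assumed candidate step: intersect each pair of transactions and
    keep the shared itemsets that contain at least min_len items."""
    candidates = set()
    for t1, t2 in combinations(transactions, 2):
        common = frozenset(t1.keys() & t2.keys())
        if len(common) >= min_len:
            candidates.add(common)
    return candidates


def high_utility_long_itemsets(transactions, unit_profit, min_len, min_util):
    """Return {itemset: utility} for long candidates meeting the threshold."""
    result = {}
    for c in long_candidates_by_intersection(transactions, min_len):
        u = itemset_utility(c, transactions, unit_profit)
        if u >= min_util:
            result[c] = u
    return result


found = high_utility_long_itemsets(transactions, unit_profit, min_len=3, min_util=30)
```

With this toy data, the pairwise intersections yield two long candidates, {a, b, c} and {a, b, c, d}, and only the first reaches the utility threshold of 30. The candidates are produced directly at their full length, rather than being grown one item at a time as in a bottom-up, Apriori-like search.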