Fast and memory efficient mining of high-utility itemsets from data streams: with and without negative item profits

  • Authors:
  • Hua-Fu Li;Hsin-Yun Huang;Suh-Yin Lee

  • Affiliations:
  • Kainan University, Department of Information Management, Taoyuan, Taiwan;National Chiao-Tung University, Department of Computer Science, Hsinchu, Taiwan;National Chiao-Tung University, Department of Computer Science, Hsinchu, Taiwan

  • Venue:
  • Knowledge and Information Systems - Special Issue on Data Warehousing and Knowledge Discovery from Sensors and Streams
  • Year:
  • 2011

Abstract

Mining utility itemsets from data streams is one of the most interesting research issues in data mining and knowledge discovery. In this paper, two efficient sliding window-based algorithms, MHUI-BIT (Mining High-Utility Itemsets based on BITvector) and MHUI-TID (Mining High-Utility Itemsets based on TIDlist), are proposed for mining high-utility itemsets from data streams. Within the sliding window-based framework of the proposed approaches, two effective representations of item information, Bitvector and TIDlist, and a lexicographical tree-based summary data structure, LexTree-2HTU, are developed to improve the efficiency of discovering high-utility itemsets with positive profits from data streams. Experimental results show that the proposed algorithms outperform existing approaches for discovering high-utility itemsets from data streams over sliding windows. In addition, we propose adapted versions of MHUI-BIT and MHUI-TID to handle the case of mining high-utility itemsets with negative item profits. Experiments show that these variants are efficient approaches for mining high-utility itemsets with negative item profits over transaction-sensitive sliding windows in data streams.
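To make the TIDlist idea concrete, the following is a minimal Python sketch of a transaction-sensitive sliding window that keeps one list of transaction IDs per item and computes the utility of an itemset by intersecting those lists. It is not the MHUI-TID algorithm itself: the abstract does not give its details, so the class name `SlidingWindowTidIndex`, its methods, and the utility definition used here (sum of profit times purchased quantity over the window transactions that contain every item of the itemset) are assumptions made for illustration only. LexTree-2HTU and any pruning are left out.

```python
from collections import deque


class SlidingWindowTidIndex:
    """Hypothetical helper: per-item TID lists over the last `window_size` transactions."""

    def __init__(self, window_size, profits):
        self.window_size = window_size
        self.profits = profits      # item -> unit profit (may be negative)
        self.window = deque()       # (tid, {item: quantity}), oldest first
        self.tidlists = {}          # item -> deque of tids, oldest first
        self.next_tid = 0

    def add(self, transaction):
        """Append a transaction, evicting the oldest one if the window is full."""
        if len(self.window) == self.window_size:
            _, old_tx = self.window.popleft()
            for item in old_tx:
                tids = self.tidlists[item]
                tids.popleft()      # the evicted tid is always the oldest entry
                if not tids:
                    del self.tidlists[item]
        tid = self.next_tid
        self.next_tid += 1
        self.window.append((tid, dict(transaction)))
        for item in transaction:
            self.tidlists.setdefault(item, deque()).append(tid)

    def utility(self, itemset):
        """Utility of `itemset` over the transactions currently in the window."""
        items = list(itemset)
        if not items:
            return 0
        # Transactions containing every item are the intersection of the TID lists.
        common = set(self.tidlists.get(items[0], ()))
        for item in items[1:]:
            common &= set(self.tidlists.get(item, ()))
        by_tid = dict(self.window)
        return sum(self.profits[i] * by_tid[t][i] for t in common for i in items)
```

With a window of three transactions, adding a fourth evicts the first, and the TID lists shrink accordingly, so the utility of an itemset reflects only the transactions still in the window. A bitvector variant would replace each TID list with a fixed-length bit array and the intersection with a bitwise AND.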