Variable support mining of frequent itemsets over data streams using synopsis vectors

  • Authors:
  • Ming-Yen Lin;Sue-Chen Hsueh;Sheng-Kun Hwang

  • Affiliations:
  • Department of Information Engineering and Computer Science, Feng-Chia University, Taiwan;Department of Information Management, Chaoyang University of Technology, Taiwan;Department of Information Engineering and Computer Science, Feng-Chia University, Taiwan

  • Venue:
  • PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2006

Abstract

Mining frequent itemsets over data streams has emerged as a research topic in recent years. Previous approaches generally use a fixed support threshold to discover patterns in the stream. In practice, however, the threshold changes to meet the needs of users and the characteristics of the incoming data. In a non-streaming environment, changing the threshold implies re-mining all of the transactions. The "look-once" nature of streaming data, however, means that discarded transactions are no longer available, so re-mining the stream is impossible. We therefore propose a method for variable support mining of frequent itemsets over data streams. A synopsis vector is constructed to maintain statistics of past transactions and is invoked only when necessary. Experimental results show that our approach is efficient and scalable for variable support mining in data streams.
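
To make the problem concrete, the sketch below shows one simple way a summary of past transactions can answer queries at different support thresholds without re-reading the stream. It is not the synopsis vector of the paper, whose construction the abstract does not describe. The class `ItemsetCounts`, its `max_size` parameter, and the sample stream are all hypothetical, and the summary simply keeps exact counts of every itemset up to a small size, which would not scale to a real stream.

```python
from collections import Counter
from itertools import combinations


class ItemsetCounts:
    """Hypothetical stand-in for a stream summary: exact counts of small itemsets.

    Assumption: this is NOT the paper's synopsis vector, only an illustration.
    """

    def __init__(self, max_size=2):
        self.max_size = max_size      # largest itemset size that is tracked
        self.counts = Counter()       # itemset (sorted tuple) -> occurrence count
        self.n_transactions = 0

    def add(self, transaction):
        """Update the statistics from one transaction, which is then discarded."""
        items = sorted(set(transaction))
        self.n_transactions += 1
        for size in range(1, self.max_size + 1):
            for itemset in combinations(items, size):
                self.counts[itemset] += 1

    def frequent(self, min_support):
        """Return tracked itemsets whose relative support is at least min_support."""
        threshold = min_support * self.n_transactions
        return {s: c for s, c in self.counts.items() if c >= threshold}


# Made-up stream; each transaction is seen exactly once.
stream = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
summary = ItemsetCounts(max_size=2)
for t in stream:
    summary.add(t)

# The threshold can change after the transactions are gone.
high = summary.frequent(0.75)
low = summary.frequent(0.5)
```

Lowering the threshold from 0.75 to 0.5 surfaces additional itemsets from the same stored counts, which is the behaviour variable support mining needs. A practical method would have to bound the memory of the summary, which this sketch makes no attempt to do.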