Mining Recent Frequent Itemsets in Data Streams

  • Authors:
  • Kun Li;Yong-yan Wang;Manzoor Ellahi;Hong-an Wang

  • Venue:
  • FSKD '08 Proceedings of the 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 04
  • Year:
  • 2008

Abstract

Mining frequent itemsets in data streams has been a hot research topic in recent years. Because data streams are continuous, high-speed, and unbounded, traditional algorithms designed for static datasets are not suitable for mining them. In this paper we present the Bounded Frequent Itemsets stream (BFI-stream) algorithm, which uses a prefix-tree-based structure, called the BFI-tree, to maintain all frequent itemsets, with accurate frequencies, from sliding windows over data streams. By monitoring the boundary between frequent and infrequent itemsets, it restricts the update process to a small part of the tree. Mining all frequent itemsets with accurate frequencies then requires only a traversal of the tree. The algorithm is time-efficient even when the user-specified minimum support threshold is small. Experiments compare its time and space usage with those of MFI-TransSW, which also returns all frequent itemsets with accurate frequencies from sliding windows. The results show that BFI-stream outperforms MFI-TransSW in both time and space in most cases, especially when the minimum support is small.
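
To make the setting concrete, the following is a minimal sketch of exact frequent-itemset mining over a sliding window with a prefix tree. It is not the BFI-stream algorithm: the abstract gives no details of the BFI-tree or of its boundary monitoring, so this naive baseline updates a count for every subset of each arriving and expiring transaction, which is exactly the cost that restricting updates to a small part of the tree is meant to avoid. The names `Node`, `SlidingWindowMiner`, `add`, and `frequent_itemsets` are hypothetical, and the window is assumed to hold a fixed number of transactions.

```python
from collections import deque
from itertools import combinations


class Node:
    """One prefix-tree node; the path from the root spells a sorted itemset."""

    def __init__(self):
        self.count = 0       # occurrences of this itemset in the current window
        self.children = {}   # item -> Node


class SlidingWindowMiner:
    """Naive exact miner (illustrative only, not BFI-stream)."""

    def __init__(self, window_size, min_support):
        self.window_size = window_size  # assumed: window measured in transactions
        self.min_support = min_support  # fraction of the window, e.g. 0.5
        self.window = deque()
        self.root = Node()

    def _update(self, transaction, delta):
        # Touch every non-empty subset of the transaction.
        # This is exponential in transaction length, so it only suits short transactions.
        items = sorted(set(transaction))
        for k in range(1, len(items) + 1):
            for itemset in combinations(items, k):
                node = self.root
                for item in itemset:
                    node = node.children.setdefault(item, Node())
                node.count += delta

    def add(self, transaction):
        """Append a transaction and expire the oldest one if the window is full."""
        self.window.append(transaction)
        self._update(transaction, +1)
        if len(self.window) > self.window_size:
            self._update(self.window.popleft(), -1)

    def frequent_itemsets(self):
        """Traverse the tree and return {itemset: exact count} for frequent itemsets."""
        threshold = self.min_support * len(self.window)
        result = {}
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for item, child in node.children.items():
                # Supersets of an infrequent itemset cannot be frequent,
                # so an infrequent child's subtree is skipped entirely.
                if child.count > 0 and child.count >= threshold:
                    itemset = prefix + (item,)
                    result[itemset] = child.count
                    stack.append((itemset, child))
        return result


if __name__ == "__main__":
    miner = SlidingWindowMiner(window_size=4, min_support=0.5)
    for t in (["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b"]):
        miner.add(t)
    print(miner.frequent_itemsets())
```

In this sketch the query step is a single pruned traversal, which mirrors the abstract's remark that mining reduces to traversing the tree; the update step is where a boundary-based approach would differ. Nodes whose count drops to zero are left in place here for simplicity.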