Maintaining frequent closed itemsets over a sliding window

  • Authors:
  • James Cheng;Yiping Ke;Wilfred Ng

  • Affiliations:
  • Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong;Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong;Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong

  • Venue:
  • Journal of Intelligent Information Systems
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we study the incremental update of Frequent Closed Itemsets (FCIs) over a sliding window in a high-speed data stream. We propose the notion of semi-FCIs, which is to progressively increase the minimum support threshold for an itemset as it is retained longer in the window, thereby drastically reducing the number of itemsets that need to be maintained and processed. We explore the properties of semi-FCIs and observe that a majority of the subsets of a semi-FCI are not semi-FCIs and need not be updated. This finding allows us to devise an efficient algorithm, IncMine, that incrementally updates the set of semi-FCIs over a sliding window. We also develop an inverted index to facilitate the update process. Our empirical results show that IncMine achieves significantly higher throughput and consumes less memory than the state-of-the-art streaming algorithms for mining FCIs and FIs. IncMine also attains high accuracy of 100% precision and over 93% recall.