Max-FISM: Mining (recently) maximal frequent itemsets over data streams using the sliding window model

  • Authors:
  • Zahra Farzanyar;Mohammadreza Kangavari;Nick Cercone

  • Affiliations:
  • Department of Computer Engineering, Iran University of Science & Technology, Tehran, Iran;Department of Computer Engineering, Iran University of Science & Technology, Tehran, Iran;Department of Computer Science and Engineering, York University, Toronto, Canada

  • Venue:
  • Computers & Mathematics with Applications
  • Year:
  • 2012

Abstract

Frequent itemset mining from data streams is an important data mining problem with broad applications, such as retail market data analysis, network monitoring, web usage mining, and stock market prediction. However, it is also a difficult problem because streaming data is unbounded, high-speed, and continuous. Extracting frequent itemsets from the more recent data can enhance the analysis of stream data. In this paper, we propose an efficient algorithm, called Max-FISM (Maximal-Frequent Itemsets Mining), for mining recent maximal frequent itemsets from a high-speed stream of transactions within a sliding window. In our algorithm, whenever a new transaction is inserted into the current window, only its maximum itemset is inserted into a prefix-tree-based summary data structure called Max-Set, which maintains the number of independent appearances of each transaction in the current window. Finally, the set of recent maximal frequent itemsets is obtained from the current Max-Set. Experimental studies show that the proposed Max-FISM algorithm is highly efficient in terms of memory and time complexity for mining recent maximal frequent itemsets over high-speed data streams.
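
To make the problem statement concrete, the following is a minimal brute-force sketch of what "recent maximal frequent itemsets within a sliding window" means. It is not the Max-FISM algorithm and does not implement the Max-Set prefix tree, whose details the abstract does not give. The function names, the count-based window, and the absolute threshold `min_count` are assumptions made for illustration only. An itemset is treated as frequent if at least `min_count` transactions in the window contain it, and as maximal if no proper superset of it is also frequent.

```python
from collections import Counter, deque
from itertools import combinations


def maximal_frequent_itemsets(window, min_count):
    """Return the maximal frequent itemsets of the transactions in `window`.

    Brute force: enumerates every non-empty subset of every transaction,
    so it is only usable on toy inputs.
    """
    support = Counter()
    for transaction in window:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                support[frozenset(subset)] += 1

    frequent = [s for s, count in support.items() if count >= min_count]
    # An itemset is maximal if no frequent itemset is a proper superset of it.
    return {s for s in frequent if not any(s < t for t in frequent)}


def mine_stream(stream, window_size, min_count):
    """Yield the maximal frequent itemsets after each arriving transaction."""
    # A deque with maxlen drops the oldest transaction automatically,
    # which models a count-based sliding window (an assumption here).
    window = deque(maxlen=window_size)
    for transaction in stream:
        window.append(frozenset(transaction))
        yield maximal_frequent_itemsets(window, min_count)


# Hypothetical toy stream of five transactions.
stream = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}, {"b", "d"}]
results = list(mine_stream(stream, window_size=3, min_count=2))
```

With this toy stream, the result after the third transaction is the pair of itemsets {a, b} and {a, c}; once the window has slid to the last three transactions, only {b, d} remains. Recomputing from scratch at every step, as this sketch does, is exactly the cost that an incrementally maintained summary structure is meant to avoid.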