Mining frequent itemsets in data streams using the weighted sliding window model

  • Authors:
  • Pauray S. M. Tsai

  • Affiliations:
  • Department of Computer Science and Information Engineering, Minghsin University of Science and Technology, Taiwan

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2009

Quantified Score

Hi-index 12.05

Visualization

Abstract

In recent years, data stream mining has become an important research topic. With the emergence of new applications, the data we process are not again static, but the continuous dynamic data stream. Examples include network traffic analysis, Web click stream mining, network intrusion detection, and on-line transaction analysis. In this paper, we propose a new framework for data stream mining, called the weighted sliding window model. The proposed model allows the user to specify the number of windows for mining, the size of a window, and the weight for each window. Thus users can specify a higher weight to a more significant data section, which will make the mining result closer to user's requirements. Based on the weighted sliding window model, we propose a single pass algorithm, called WSW, to efficiently discover all the frequent itemsets from data streams. By analyzing data characteristics, an improved algorithm, called WSW-Imp, is developed to further reduce the time of deciding whether a candidate itemset is frequent or not. Empirical results show that WSW-Imp outperforms WSW under the weighted sliding window model.