Mining maximal frequent itemsets from data streams

  • Authors:
  • Guojun Mao; Xindong Wu; Xingquan Zhu; Gong Chen; Chunnian Liu

  • Affiliations:
  • Department of Computer Science, University of Vermont,Burlington VT 05405, USA;Department of Computer Science, University of Vermont,Burlington VT 05405, USA;Department of Computer Science, University of Vermont,Burlington VT 05405, USA;Department of Computer Science, University of Vermont,Burlington VT 05405, USA;School of Computer Science, Beijing University of Technology,Beijing 100022, P.R. China

  • Venue:
  • Journal of Information Science
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Frequent pattern mining from data streams is an active research topic in data mining. Existing research efforts often rely on a two-phase framework to discover frequent patterns: (1) using internal data structures to store meta-patterns obtained by scanning the stream data; and (2) re-mining the meta-patterns to finalize and output frequent patterns. The defectiveness of such a two-phase framework lies in the fact that the two stages provide barriers to dynamically and immediately finding frequent patterns with online functionalities. It is expected that a single-phase algorithm can fulfil frequent pattern mining from data streams in such a way that the users can see patterns in an immediate and dynamic manner, as soon as the patterns have become frequent. In this paper, we propose INSTANT, a single-phase algorithm for discovering frequent itemsets from data streams. The theoretical foundation of INSTANT is based on a framework theory on a set of itemsets, which is also presented in the paper. The novel design of INSTANT ensures that it employs compact data structures to mine frequent patterns from data streams in a single phase. Our experimental results demonstrate the time and space efficiency of the proposed algorithm.