Sequential pattern mining from stream data

  • Authors:
  • Adam Koper; Hung Son Nguyen

  • Affiliations:
  • Institute of Mathematics, The University of Warsaw, Warsaw, Poland; Institute of Mathematics, The University of Warsaw, Warsaw, Poland

  • Venue:
  • ADMA'11 Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part II
  • Year:
  • 2011

Abstract

Sequential Pattern Mining (SPM) is an interesting problem in data mining that can be applied to temporal or time series data. This paper concerns SPM algorithms that can work with stream data. We present three new stream SPM methods, called SS-BE2, SS-LC and SS-LC2, which are extensions of SS-BE. Like SS-BE, the proposed methods process fixed-size batches using the PrefixSpan algorithm; the critical problem in each step is how to store the huge number of candidate patterns and how to select the frequent patterns properly. The main idea is to improve the tree pruning method of the original SS-BE in order to guarantee high completeness and correctness of the result. In all experiments performed on benchmark data, the proposed solutions outperform the original SS-BE algorithm. Moreover, the proposed algorithms seem to be scalable, as memory usage depends linearly on the number of patterns and the size of the buffer.
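
To make the batch-based scheme described in the abstract more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes sequences of single items (no itemsets), keeps candidate patterns in a flat dictionary rather than a tree, and uses a deliberately crude pruning rule (a fixed count threshold applied every few batches). The names `prefixspan`, `mine_stream`, `batch_min_support`, `prune_every` and `prune_min_count` are hypothetical, and the actual pruning rules of SS-BE, SS-BE2, SS-LC and SS-LC2 are not given in the abstract and are not reproduced here.

```python
from collections import defaultdict
from itertools import islice


def prefixspan(sequences, min_count):
    """Simplified PrefixSpan for sequences of single items.

    Returns {pattern (tuple of items): number of supporting sequences}.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, start position) pairs, i.e. the
        # suffixes that remain after matching `prefix` as early as possible.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            for item in set(sequences[seq_idx][start:]):
                counts[item] += 1
        for item, count in counts.items():
            if count < min_count:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            next_projected = []
            for seq_idx, start in projected:
                seq = sequences[seq_idx]
                try:
                    pos = seq.index(item, start)
                except ValueError:
                    continue  # this sequence does not extend the pattern
                next_projected.append((seq_idx, pos + 1))
            grow(pattern, next_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


def mine_stream(stream, batch_size, batch_min_support, prune_every, prune_min_count):
    """Mine a stream in fixed-size batches and accumulate pattern counts.

    Assumption: the pruning rule below (drop patterns whose accumulated
    count is under a fixed threshold) is illustrative only.
    """
    store = defaultdict(int)
    iterator = iter(stream)
    batches_seen = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        batches_seen += 1
        min_count = max(1, int(batch_min_support * len(batch)))
        for pattern, count in prefixspan(batch, min_count).items():
            store[pattern] += count
        if batches_seen % prune_every == 0:
            for pattern in [p for p, c in store.items() if c < prune_min_count]:
                del store[pattern]
    return dict(store)


example_stream = [["a", "b", "c"], ["a", "c"], ["a", "b"], ["b", "c"]]
frequent = mine_stream(example_stream, batch_size=2, batch_min_support=0.5,
                       prune_every=2, prune_min_count=2)
print(frequent[("a",)])               # 3
print(("a", "b", "c") in frequent)    # False: pruned, seen only once
```

In this toy run the pattern `("a", "b", "c")` is found in the first batch but removed at the pruning step because its accumulated count stays below the threshold. This illustrates the trade-off named in the abstract: aggressive pruning saves memory but risks losing patterns that would later become frequent.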