Sequential pattern mining from stream data

Authors:
Adam Koper; Hung Son Nguyen
Affiliations:
Institute of Mathematics, The University of Warsaw, Warsaw, Poland; Institute of Mathematics, The University of Warsaw, Warsaw, Poland
Venue:
ADMA'11 Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part II
Year:
2011

References:
Mining Sequential Patterns: Generalizations and Performance Improvements. EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology
Mining Sequential Patterns. ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
IncSpan: incremental mining of sequential patterns in large database. Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach. IEEE Transactions on Knowledge and Data Engineering
PLWAP sequential mining: open source code. Proceedings of the 1st international workshop on open source data mining: frequent pattern mining implementations
Debellor: A Data Mining Platform with Stream Architecture. Transactions on Rough Sets IX
Stream Sequential Pattern Mining with Precise Error Bounds. ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining

Abstract

Sequential Pattern Mining (SPM) is a data mining problem that applies to temporal and time series data. This paper concerns SPM algorithms that can work with stream data. We present three new stream SPM methods, called SS-BE2, SS-LC and SS-LC2, which are extensions of SS-BE. Like SS-BE, the proposed methods process fixed-size batches using the PrefixSpan algorithm; the critical problem in each step is how to store the huge number of candidate patterns and how to select the frequent patterns properly. The main idea is to improve the tree pruning method of the original SS-BE so as to guarantee high completeness and correctness of the result. In all experiments performed on benchmark data, the proposed solutions outperform the original SS-BE algorithm. Moreover, the proposed algorithms seem to be scalable, as memory usage depends linearly on the number of patterns and the size of the buffer.
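
The batch-oriented scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation of SS-BE, SS-BE2, SS-LC or SS-LC2: the function names `prefixspan` and `mine_stream`, the parameter `prune_per_batch`, the flat dictionary used in place of a pattern tree, and the pruning rule itself are assumptions made only for illustration, and sequences are simplified to lists of single items.

```python
from collections import defaultdict


def prefixspan(sequences, min_count):
    """Simplified PrefixSpan for sequences of single items.

    Returns a dict mapping each frequent pattern (a tuple of items)
    to the number of sequences that contain it as a subsequence.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, start position) pairs that
        # describe the suffixes still to be scanned for this prefix.
        counts = defaultdict(int)
        for sid, start in projected:
            for item in set(sequences[sid][start:]):
                counts[item] += 1
        for item, count in sorted(counts.items()):
            if count < min_count:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            # Project every suffix past the first occurrence of item.
            next_projected = []
            for sid, start in projected:
                try:
                    pos = sequences[sid].index(item, start)
                except ValueError:
                    continue
                next_projected.append((sid, pos + 1))
            grow(pattern, next_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


def mine_stream(batches, batch_min_count, prune_per_batch):
    """Mine fixed-size batches one at a time and accumulate pattern counts.

    Assumed pruning rule (hypothetical, not the rule from the paper):
    after the k-th batch, drop every stored pattern whose accumulated
    count is below prune_per_batch * k.
    """
    store = defaultdict(int)  # flat stand-in for a pattern tree
    for k, batch in enumerate(batches, start=1):
        for pattern, count in prefixspan(batch, batch_min_count).items():
            store[pattern] += count
        threshold = prune_per_batch * k
        for pattern in [p for p, c in store.items() if c < threshold]:
            del store[pattern]
    return dict(store)
```

In this sketch the store grows with the number of retained patterns, which is why the choice of pruning rule matters; a rule that prunes too aggressively loses patterns that become frequent later, while one that prunes too little keeps many infrequent candidates in memory.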