GraSeq: A Novel Approximate Mining Approach of Sequential Patterns over Data Stream

  • Authors:
  • Haifeng Li;Hong Chen

  • Affiliations:
  • School of Information, Renmin University, Beijing, 100872, P.R. China;School of Information, Renmin University, Beijing, 100872, P.R. China

  • Venue:
  • ADMA '07 Proceedings of the 3rd international conference on Advanced Data Mining and Applications
  • Year:
  • 2007

Abstract

Sequential pattern mining is an important data mining approach with broad applications. Traditional mining algorithms designed for static databases are not suited to data streams. Recently, several approximate sequential pattern mining algorithms over data streams have been presented; they solve some of the problems, but still waste too many system resources when processing long sequences. Based on observation and proof, a novel approximate sequential pattern mining algorithm named GraSeq is proposed. GraSeq uses a directed weighted graph structure to store a synopsis of the sequences with only one scan of the data stream. Furthermore, a subsequence matching method is introduced to reduce the cost of processing long sequences, and the concept of a valid node is introduced to improve the accuracy of the mining results. Our experimental results demonstrate that the algorithm is effective and efficient.
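
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of keeping a one-pass, directed weighted graph synopsis of a sequence stream. It is not the published GraSeq algorithm. The class name `SequenceGraphSynopsis`, its methods, and the choice to weight an edge by the number of sequences containing that adjacent item pair are all assumptions made for illustration; the subsequence matching method and the valid node concept are not modeled.

```python
from collections import defaultdict


class SequenceGraphSynopsis:
    """Hypothetical one-pass synopsis (an illustration, not GraSeq itself).

    Nodes are items; a directed edge (u, v) is weighted by the number of
    sequences in which v immediately follows u.
    """

    def __init__(self):
        self.node_weight = defaultdict(int)  # item -> number of sequences containing it
        self.edge_weight = defaultdict(int)  # (u, v) -> number of sequences with u then v
        self.sequences_seen = 0

    def add_sequence(self, sequence):
        """Fold one sequence from the stream into the graph; it is never stored."""
        self.sequences_seen += 1
        # Count each item and each adjacent pair at most once per sequence,
        # so that weights behave like supports.
        for item in set(sequence):
            self.node_weight[item] += 1
        for edge in set(zip(sequence, sequence[1:])):
            self.edge_weight[edge] += 1

    def estimated_support(self, pattern):
        """Approximate support of a contiguous pattern.

        The minimum edge weight along the pattern's path is an upper bound
        on the true support, so the answer can overestimate.
        """
        if not pattern:
            return 0
        if len(pattern) == 1:
            return self.node_weight.get(pattern[0], 0)
        return min(self.edge_weight.get(edge, 0)
                   for edge in zip(pattern, pattern[1:]))


if __name__ == "__main__":
    synopsis = SequenceGraphSynopsis()
    for seq in (["a", "b", "c"], ["a", "b", "d"], ["b", "c"]):
        synopsis.add_sequence(seq)
    print(synopsis.estimated_support(["a", "b", "c"]))
```

In this toy stream the pattern a, b, c occurs in only one sequence, yet the sketch estimates a support of 2, because the edges (a, b) and (b, c) each appear in two sequences. This kind of overestimate shows why a graph synopsis yields approximate rather than exact results.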