Identifying Similar Subsequences in Data Streams

  • Authors:
  • Machiko Toyoda; Yasushi Sakurai; Toshikazu Ichikawa

  • Affiliations:
  • NTT Information Sharing Platform Laboratories, NTT Corporation, Tokyo, Japan 180-8585; NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan 619-0237; NTT Information Sharing Platform Laboratories, NTT Corporation, Tokyo, Japan 180-8585

  • Venue:
  • DEXA '08: Proceedings of the 19th International Conference on Database and Expert Systems Applications
  • Year:
  • 2008

Abstract

Similarity search has been studied in the domain of time series data mining, and it is an important technique in stream mining. Since the sampling rates of streams frequently differ and their time periods vary in practical situations, a method that handles time warping, such as Dynamic Time Warping (DTW), is suitable for measuring similarity. However, finding pairs of similar subsequences between co-evolving sequences is difficult because of the increased complexity: DTW is a method for detecting sequences that are similar to a given query sequence. In this paper, we focus on the problem of finding pairs of similar subsequences and periodicity over data streams. We propose a method to detect similar subsequences in a streaming fashion. Our approach to measuring similarity relies on a proposed scoring function that incrementally updates a score, which makes it suitable for data stream processing. We also present an efficient algorithm based on the scoring function. Our experiments on real and synthetic data demonstrate that our method detects the pairs of qualifying subsequences correctly and that it is dramatically faster than the existing method.
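
For readers unfamiliar with DTW, the following is a minimal sketch of the classical whole-sequence DTW distance that the abstract refers to as background. It is not the scoring function or the streaming algorithm proposed in the paper, which the abstract does not specify. The function name `dtw_distance` and the choice of squared difference as the local cost are assumptions made for illustration.

```python
import math


def dtw_distance(x, y):
    """Classical DTW distance between two numeric sequences.

    Assumption: the local cost is the squared difference of two values.
    Uses O(len(y)) memory by keeping only the previous row of the table.
    """
    n, m = len(x), len(y)
    # prev[j] holds the best cumulative cost of aligning x[:i-1] with y[:j].
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three allowed warping steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


# A sequence and a time-stretched copy of it align with zero cost.
print(dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3]))  # 0.0
```

Each call fills a table of size len(x) × len(y) for one fixed pair of sequences. Applying it naively to every pair of candidate subsequences of two streams would repeat that work for each choice of start and end points, which illustrates the complexity problem the abstract describes.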