Efficient Similarity Search for Time Series Data Based on the Minimum Distance

  • Authors:
  • Sangjun Lee; Dongseop Kwon; Sukho Lee

  • Venue:
  • CAiSE '02 Proceedings of the 14th International Conference on Advanced Information Systems Engineering
  • Year:
  • 2002

Abstract

We address the problem of efficient similarity search based on the minimum distance in large time series databases. Most previous work has focused on similarity matching and retrieval of time series based on the Euclidean distance. However, as we demonstrate in this paper, the Euclidean distance has limitations as a similarity measure. It is sensitive to the absolute offsets of time sequences, so two time sequences that have similar shapes but different vertical positions may be classified as dissimilar. The minimum distance is a more suitable similarity measure than the Euclidean distance in many applications where the shape of the time series is the major consideration. To support minimum distance queries, most previous work includes a preprocessing step of vertical shifting that normalizes each time sequence by its mean before indexing. In this paper, we propose a novel and fast indexing scheme called segmented mean variation indexing (SMV-indexing). Our indexing scheme can match time series of similar shapes without vertical shifting and guarantees no false dismissals. Several experiments are performed on real data (stock price movements) to measure the performance of our indexing scheme. The experiments show that SMV-indexing is more efficient than sequential scanning.
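
The abstract does not state the formal definition of the minimum distance, so the following is only a minimal sketch under an assumption: that the minimum distance between two equal-length sequences is the Euclidean distance minimized over a constant vertical shift of one of them. Under that assumption the best shift is the mean of the pointwise differences, which is equivalent to subtracting each sequence's own mean before comparing. The function names and sample values are illustrative, and the sketch shows only the distance measure, not the SMV-indexing scheme itself.

```python
import math


def euclidean_distance(x, y):
    """Plain Euclidean distance between two equal-length sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def minimum_distance(x, y):
    """Euclidean distance minimized over a constant vertical shift.

    Assumption (not taken from the abstract): the minimum distance is
    min over v of ||x - y - v||. The sum of squares is minimized when
    v equals the mean of the pointwise differences.
    """
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(x, y)]
    shift = sum(diffs) / len(diffs)  # optimal vertical shift
    return math.sqrt(sum((d - shift) ** 2 for d in diffs))


# Hypothetical sequences: b has the same shape as a, offset by 100.
a = [10.0, 12.0, 11.0, 13.0]
b = [110.0, 112.0, 111.0, 113.0]
# c sits at a similar level to a but has a different shape.
c = [10.0, 8.0, 13.0, 9.0]

print(euclidean_distance(a, b))  # 200.0: dominated by the offset
print(minimum_distance(a, b))    # 0.0: identical shapes
print(euclidean_distance(a, c), minimum_distance(a, c))
```

In this example the Euclidean distance ranks `b` as far from `a` even though the two sequences have the same shape, while the shift-invariant measure treats them as identical and still separates `a` from `c`.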