Efficient Similarity Search for Time Series Data Based on the Minimum Distance

Authors:
Sangjun Lee;Dongseop Kwon;Sukho Lee
Affiliations:
-;-;-
Venue:
CAiSE '02 Proceedings of the 14th International Conference on Advanced Information Systems Engineering
Year:
2002

Citing 24
Cited 0

The R*-tree: an efficient and robust access method for points and rectangles

SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Fast subsequence matching in time-series databases

SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Similarity-based queries for time series data

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Efficiently supporting ad hoc queries in large datasets of time sequences

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Time-series similarity problems and well-separated geometric sets

SCG '97 Proceedings of the thirteenth annual symposium on Computational geometry
Matching and indexing sequences of different lengths

CIKM '97 Proceedings of the sixth international conference on Information and knowledge management
A fast projection algorithm for sequence data searching

Data & Knowledge Engineering - Special issue: next generation information technologies and systems
Fast time-series searching with scaling and shifting

PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Locally adaptive dimensionality reduction for indexing large time series databases

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
R-trees: a dynamic index structure for spatial searching

SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Database Mining: A Performance Perspective

IEEE Transactions on Knowledge and Data Engineering
Efficient Similarity Search In Sequence Databases

FODO '93 Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms
Efficient Retrieval of Similar Time Sequences Under Time Warping

ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
HierarchyScan: A Hierarchical Similarity Search Algorithm for Databases of Long Sequences

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Finding Similar Time Series

PKDD '97 Proceedings of the First European Symposium on Principles of Data Mining and Knowledge Discovery
Fast Time Sequence Indexing for Arbitrary Lp Norms

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases

VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
A Simple Dimensionality Reduction Technique for Fast Similarity Search in Large Time Series Databases

PADKK '00 Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications
Sequence Matching of Images

SSDBM '96 Proceedings of the Eighth International Conference on Scientific and Statistical Database Management
On Similarity Queries for Time-Series Data: Constraint Specification and Implementation

CP '95 Proceedings of the First International Conference on Principles and Practice of Constraint Programming
On Similarity-Based Queries for Time Series Data

ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Efficient Time Series Matching by Wavelets

ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Efficient Searches for Similar Subsequences of Different Lengths in Sequence Databases

ICDE '00 Proceedings of the 16th International Conference on Data Engineering
Landmarks: A New Model for Similarity-Based Pattern Querying in Time Series Databases

ICDE '00 Proceedings of the 16th International Conference on Data Engineering

Quantified Score

Hi-index	0.00

Visualization

Abstract

We address the problem of efficient similarity search based on the minimum distance in large time series databases. Most of previous work is focused on similarity matching and retrieval of time series based on the Euclidean distance. However, as we demonstrate in this paper, the Euclidean distance has limitations as a similarity measurement. It is sensitive to the absolute offsets of time sequences, so two time sequences that have similar shapes but with different vertical positions may be classified as dissimilar. The minimum distance is a more suitable similarity measurement than the Euclidean distance in many applications, where the shape of time series is a major consideration. To support minimum distance queries, most of previous work has the preprocessing step of vertical shifting that normalizes each time sequence by its mean before indexing. In this paper, we propose a novel and fast indexing scheme, called the segmented mean variation indexing(SMV-indexing). Our indexing scheme can match time series of similar shapes without vertical shifting and guarantees no false dismissals. Several experiments are performed on real data(stock price movement) to measure the performance of our indexing scheme. Experiments show that the SMV-indexing is more efficient than the sequential scanning in performance.