An Indexing Scheme for Fast Similarity Search in Large Time Series Databases

Authors:
Eamonn J. Keogh; Michael J. Pazzani
Venue:
SSDBM '99 Proceedings of the 11th International Conference on Scientific and Statistical Database Management
Year:
1999

Citing 0
Cited 9

Deformable Markov model templates for time-series pattern matching
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining

Scaling up dynamic time warping for datamining applications
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining

Time series similarity measures (tutorial PM-2)
Tutorial notes of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining

Efficient and robust feature extraction and pattern matching of time series by a lattice structure
Proceedings of the tenth international conference on Information and knowledge management

MSTS: A System for Mining Sets of Time Series
PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery

Dependency detection in MobiMine: a systems perspective
Information Sciences—Informatics and Computer Science: An International Journal - special issue: Knowledge discovery from distributed information sources

Correlation analysis of spatial time series datasets: a filter-and-refine approach
PAKDD'03 Proceedings of the 7th Pacific-Asia conference on Advances in knowledge discovery and data mining

SciQL: bridging the gap between science and relational DBMS
Proceedings of the 15th Symposium on International Database Engineering & Applications

Partially ordered template-based matching algorithm for financial time series
IEA/AIE'06 Proceedings of the 19th international conference on Advances in Applied Artificial Intelligence: industrial, Engineering and Other Applications of Applied Intelligent Systems

Quantified Score

Hi-index	0.00

Abstract

We address the problem of similarity search in large time series databases and introduce a novel indexing algorithm that allows faster retrieval. The index is formed by creating bins that contain time series subsequences of approximately the same shape. For each bin, we can quickly calculate a lower bound on the distance between a given query and the most similar element of the bin. This bound allows us to search the bins in best-first order and to prune some bins from the search space without examining their contents. Additional speedup is obtained by optimizing the data within each bin so that the query need not be compared with every item in the bin. We call our approach STB-indexing and experimentally validate it on space telemetry, medical, and synthetic data, demonstrating a speedup of approximately an order of magnitude.
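
The general idea of searching bins in best-first order with a per-bin lower bound can be illustrated with a minimal sketch. This is not the paper's STB-indexing: the abstract does not say how bins are built or how the bound is computed, so the sketch assumes Euclidean distance, a greedy grouping around a representative sequence, and a bound derived from the triangle inequality. All names (`build_bins`, `lower_bound`, `nearest`, the `radius` parameter) are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_bins(sequences, radius):
    """Greedy grouping (an assumption, not the paper's method): each sequence
    joins the first bin whose representative is within `radius` of it."""
    bins = []  # each bin: {"rep": sequence, "radius": float, "items": [...]}
    for seq in sequences:
        for b in bins:
            d = euclidean(seq, b["rep"])
            if d <= radius:
                b["items"].append(seq)
                b["radius"] = max(b["radius"], d)  # farthest member so far
                break
        else:
            bins.append({"rep": seq, "radius": 0.0, "items": [seq]})
    return bins


def lower_bound(query, b):
    """Triangle inequality: for every x in the bin,
    d(query, x) >= d(query, rep) - d(rep, x) >= d(query, rep) - radius."""
    return max(0.0, euclidean(query, b["rep"]) - b["radius"])


def nearest(query, bins):
    """Visit bins in increasing order of lower bound; stop as soon as the
    smallest remaining bound cannot beat the best distance found so far."""
    heap = [(lower_bound(query, b), i) for i, b in enumerate(bins)]
    heapq.heapify(heap)
    best_dist, best_seq, examined = math.inf, None, 0
    while heap:
        lb, i = heapq.heappop(heap)
        if lb >= best_dist:
            break  # all remaining bins are pruned without being opened
        examined += 1
        for seq in bins[i]["items"]:
            d = euclidean(query, seq)
            if d < best_dist:
                best_dist, best_seq = d, seq
    return best_seq, best_dist, examined


data = [[0, 0, 0], [0, 0, 1], [10, 10, 10], [10, 10, 11], [20, 20, 20]]
bins = build_bins(data, radius=2.0)
match, dist, examined = nearest([10, 10, 10.4], bins)
print(match, round(dist, 3), f"bins examined: {examined} of {len(bins)}")
```

Because the bound never overestimates the true distance to any member of a bin, pruning on it cannot discard the true nearest neighbor; the sketch still compares the query with every item inside each opened bin, which is the cost the abstract's within-bin optimization aims to avoid.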