A Simple Dimensionality Reduction Technique for Fast Similarity Search in Large Time Series Databases

Authors:
Eamonn J. Keogh;Michael J. Pazzani
Venue:
PAKDD '00 Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications
Year:
2000

Citing 18
Cited 24

Fast subsequence matching in time-series databases

SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
FastMap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Similarity-based queries for time series data

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms

Artificial Intelligence Review - Special issue on lazy learning
On the analysis of indexing schemes

PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Dimensionality reduction for similarity searching in dynamic databases

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Relevance feedback retrieval of time series data

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Adaptive query processing for time-series data

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
R-trees: a dynamic index structure for spatial searching

SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Efficient Similarity Search In Sequence Databases

FODO '93 Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms
Approximate Queries and Representations for Large Data Sequences

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Efficient Retrieval of Similar Time Sequences Under Time Warping

ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
The Haar Wavelet Transform in the Time Series Similarity Paradigm

PKDD '99 Proceedings of the Third European Conference on Principles of Data Mining and Knowledge Discovery
Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases

VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
Fast Retrieval of Similar Subsequences in Long Sequence Databases

KDEX '99 Proceedings of the 1999 Workshop on Knowledge and Data Engineering Exchange
Efficient Time Series Matching by Wavelets

ICDE '99 Proceedings of the 15th International Conference on Data Engineering
The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces

ICDE '99 Proceedings of the 15th International Conference on Data Engineering
The Fourier Transform - A Primer

The Fourier Transform - A Primer

Efficient Pattern Matching of Time Series Data

IEA/AIE '02 Proceedings of the 15th international conference on Industrial and engineering applications of artificial intelligence and expert systems: developments in applied artificial intelligence
Efficient Similarity Search for Time Series Data Based on the Minimum Distance

CAiSE '02 Proceedings of the 14th International Conference on Advanced Information Systems Engineering
Efficient Subsequence Matching in Time Series Databases Under Time and Amplitude Transformations

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
A Multiresolution Symbolic Representation of Time Series

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Mining for weak periodic signals in time series databases

Intelligent Data Analysis
Time series analysis with multiple resolutions

Information Systems
A Quick Filtering for Similarity Queries in Motion Capture Databases

PCM '09 Proceedings of the 10th Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
A real time hybrid pattern matching scheme for stock time series

ADC '10 Proceedings of the Twenty-First Australasian Conference on Database Technologies - Volume 104
Temporal data mining using shape space representations of time series

Neurocomputing
A review on time series data mining

Engineering Applications of Artificial Intelligence
TIDES--a new descriptor for time series oscillation behavior

Geoinformatica
Synthesizing routes for low sampling trajectories with absorbing Markov chains

WAIM'11 Proceedings of the 12th international conference on Web-age information management
SciQL: bridging the gap between science and relational DBMS

Proceedings of the 15th Symposium on International Database Engineering & Applications
An algorithm for high-dimensional traffic data clustering

FSKD'06 Proceedings of the Third international conference on Fuzzy Systems and Knowledge Discovery
Partially ordered template-based matching algorithm for financial time series

IEA/AIE'06 Proceedings of the 19th international conference on Advances in Applied Artificial Intelligence: industrial, Engineering and Other Applications of Applied Intelligent Systems
Continuous trend-based classification of streaming time series

ADBIS'05 Proceedings of the 9th East European conference on Advances in Databases and Information Systems
CANDELA – storage, analysis and retrieval of video content in distributed systems

AMR'05 Proceedings of the Third international conference on Adaptive Multimedia Retrieval: user, context, and feedback
SFA: a symbolic fourier approximation and index for similarity search in high dimensional datasets

Proceedings of the 15th International Conference on Extending Database Technology
Parsimonious temporal aggregation

The VLDB Journal — The International Journal on Very Large Data Bases
Multivariate time series segmentation for generalized description of dynamic systems operation

Optical Memory and Neural Networks
Adaptive pattern mining model for early detection of botnet-propagation scale

Security and Communication Networks
Estimating a user’s internal state before the first input utterance

Advances in Human-Computer Interaction
OBST-based segmentation approach to financial time series

Engineering Applications of Artificial Intelligence
Stock market co-movement assessment using a three-phase clustering method

Expert Systems with Applications: An International Journal

Abstract

We address the problem of similarity search in large time series databases. We introduce a novel dimensionality reduction technique that supports an indexing algorithm more than an order of magnitude faster than the previous best-known method. In addition to being much faster, our approach has numerous other advantages: it is simple to understand and implement, it allows more flexible distance measures, including weighted Euclidean queries, and the index can be built in linear time. We call our approach PCA-indexing (Piece-wise Constant Approximation) and experimentally validate it on space telemetry, financial, astronomical, medical, and synthetic data.
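
The abstract does not spell out the reduction itself, so the following is only a minimal sketch of one common reading of a piecewise constant approximation: split a series into equal-length frames and keep the mean of each frame. The function names (`piecewise_means`, `reduced_distance`), the equal-frame assumption, and the sample data are illustrative and not taken from the paper. The scaling by the square root of the frame length is the standard way to make the reduced-space distance a lower bound on the true Euclidean distance, which is what lets an index prune candidates without false dismissals.

```python
import math


def piecewise_means(series, n_segments):
    """Reduce a series to the means of n_segments equal-length frames.

    Assumption for this sketch: len(series) is a multiple of n_segments.
    The reduction is a single pass over the data, so it runs in linear time.
    """
    n = len(series)
    if n_segments <= 0 or n % n_segments != 0:
        raise ValueError("series length must be a positive multiple of n_segments")
    frame = n // n_segments
    return [sum(series[i:i + frame]) / frame for i in range(0, n, frame)]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reduced_distance(ra, rb, original_length):
    """Distance in the reduced space, scaled by sqrt(frame length).

    With this scaling the result never exceeds the Euclidean distance
    between the original series.
    """
    frame = original_length // len(ra)
    return math.sqrt(frame) * euclidean(ra, rb)


# Hypothetical sample data: two series of length 8, reduced to 4 values each.
x = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0, 7.0, 5.0]
y = [2.0, 2.0, 5.0, 3.0, 5.0, 9.0, 4.0, 6.0]

rx = piecewise_means(x, 4)  # [2.0, 3.0, 7.0, 6.0]
ry = piecewise_means(y, 4)  # [2.0, 4.0, 7.0, 5.0]

lower_bound = reduced_distance(rx, ry, len(x))
true_distance = euclidean(x, y)
print(lower_bound, true_distance)
```

In a search setting, the reduced vectors would be stored in a multidimensional index, and `reduced_distance` would be used to discard candidates whose lower bound already exceeds the best true distance found so far.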