Exact indexing for massive time series databases under time warping distance

Authors:
Vit Niennattrakul;Pongsakorn Ruengronghirunya;Chotirat Ann Ratanamahatana
Affiliations:
Department of Computer Engineering, Chulalongkorn University, Bangkok, Thailand;Department of Computer Engineering, Chulalongkorn University, Bangkok, Thailand;Department of Computer Engineering, Chulalongkorn University, Bangkok, Thailand
Venue:
Data Mining and Knowledge Discovery
Year:
2010

Abstract

Among the many distance measures for time series data, Dynamic Time Warping (DTW) distance is recognized as one of the most accurate and suitable because of its flexibility in sequence alignment. However, computing DTW distance is expensive. In very large time series databases, a sequential scan of the entire database is impractical, and random access through an index structure does not help much either, because the high dimensionality of time series data incurs extremely high I/O cost. More specifically, a sequential structure incurs high CPU cost but low I/O cost, while an index structure incurs low CPU cost but high I/O cost. In this work, we therefore propose a novel indexed sequential structure called TWIST (Time Warping in Indexed Sequential sTructure), which benefits from both sequential access and an index structure. When a query sequence is issued, TWIST calculates lower-bounding distances between a group of candidate sequences and the query sequence and then determines the data access order in advance, thereby avoiding a large number of both sequential and random accesses. Our indexed sequential structure achieves significant speedup in query processing. In addition, our method outperforms existing rival methods in query processing time, number of page accesses, and storage requirement, while guaranteeing no false dismissals.
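
The general idea of ordering data access by a lower bound can be illustrated with a small sketch. This is not the TWIST structure itself, whose grouping and on-disk layout the abstract does not describe. It assumes equal-length sequences held in memory, a Sakoe-Chiba warping window, and the well-known LB_Keogh envelope bound as a stand-in for the lower-bounding function. The names `dtw`, `lb_keogh`, `nearest_neighbor`, and `window` are hypothetical and chosen for illustration only.

```python
import math


def dtw(a, b, window):
    """DTW distance with squared-error cost inside a Sakoe-Chiba band."""
    n, m = len(a), len(b)
    w = max(window, abs(n - m))  # the band must be wide enough to reach (n, m)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def lb_keogh(query, candidate, window):
    """LB_Keogh lower bound of dtw(query, candidate, window).

    Assumption: both sequences have the same length.
    """
    n = len(query)
    total = 0.0
    for i, c in enumerate(candidate):
        segment = query[max(0, i - window):min(n, i + window + 1)]
        upper, lower = max(segment), min(segment)
        if c > upper:
            total += (c - upper) ** 2
        elif c < lower:
            total += (lower - c) ** 2
    return math.sqrt(total)


def nearest_neighbor(query, database, window):
    """Exact 1-NN search that visits candidates in ascending lower-bound order.

    Returns (best_index, best_distance, number_of_full_dtw_computations).
    """
    bounds = [lb_keogh(query, seq, window) for seq in database]
    order = sorted(range(len(database)), key=bounds.__getitem__)
    best_index, best_dist, computed = -1, math.inf, 0
    for k in order:
        # Every remaining candidate has a bound at least this large, so none
        # of them can beat the best distance found so far.
        if bounds[k] >= best_dist:
            break
        d = dtw(query, database[k], window)
        computed += 1
        if d < best_dist:
            best_index, best_dist = k, d
    return best_index, best_dist, computed
```

Because the bound never exceeds the true DTW distance, stopping at the first candidate whose bound reaches the best distance found so far cannot discard the true nearest neighbor, which is the sense in which such a search has no false dismissals.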