Quantizing time series for efficient subsequence matching

Authors:
Inés F. Vega-López;Bongki Moon
Affiliations:
School of Informatics, Autonomous University of Sinaloa, Culiacán, Sinaloa, México;Department of Computer Science, University of Arizona, Tucson, AZ
Venue:
DBA'06: Proceedings of the 24th IASTED International Conference on Database and Applications
Year:
2006

Abstract

Indexing time series data is a problem that has attracted considerable interest in the research community over the last decade. Traditional indexing methods organize the data space using different metrics. However, searching a high-dimensional space with a hierarchical index is not always efficient, because a large portion of the index may need to be accessed during a search. We revisit the problem of subsequence matching in light of recent technological advances, paying close attention to the increasing ratio of CPU to disk performance. We recognize that this problem is heavily bound by I/O operations and address the issue in two ways. First, we propose the use of quantization to generate small, homogeneous representations of time series. Quantization provides tight upper and lower bounds on the measure of similarity to a query sequence, which allows us to drastically reduce the number of false alarms during search. Second, we organize the quantized representation of the data in a linear array that can be read efficiently from disk. By reducing the number of false alarms and by reading the index sequentially, we significantly reduce the I/O cost of query processing. As a result, we improve overall search performance by up to a factor of 3 with respect to state-of-the-art techniques for subsequence matching.
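
The two ideas in the abstract, quantization bounds and a sequentially scanned linear index, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the quantizer, the distance measure, or the index layout, so the code assumes uniform scalar quantization, Euclidean distance, and an in-memory list standing in for the on-disk array. All function names and parameters (`quantize`, `distance_bounds`, `range_search`, `lo`, `hi`, `levels`) are hypothetical, and every data value is assumed to lie in `[lo, hi]`.

```python
import math

def quantize(seq, lo, hi, levels):
    """Map each value to the index of its uniform-width cell in [lo, hi].
    Assumption: every value lies in [lo, hi]; otherwise the bounds below fail."""
    width = (hi - lo) / levels
    return [min(levels - 1, max(0, int((v - lo) / width))) for v in seq]

def distance_bounds(query, codes, lo, hi, levels):
    """Lower and upper bounds on the Euclidean distance between `query` and
    any sequence whose values fall in the cells named by `codes`."""
    width = (hi - lo) / levels
    lower_sq = upper_sq = 0.0
    for q, c in zip(query, codes):
        cell_lo = lo + c * width
        cell_hi = cell_lo + width
        # Gap to the nearest point of the cell (zero if q lies inside it).
        near = max(cell_lo - q, 0.0, q - cell_hi)
        # The farthest point of the cell is one of its two endpoints.
        far = max(abs(q - cell_lo), abs(q - cell_hi))
        lower_sq += near * near
        upper_sq += far * far
    return math.sqrt(lower_sq), math.sqrt(upper_sq)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def range_search(query, data, eps, lo=0.0, hi=1.0, levels=16):
    """Return (offsets of subsequences within eps of query,
    number of candidates that needed the raw data)."""
    m = len(query)
    # Linear index: one quantized code vector per sliding window.
    index = [quantize(data[i:i + m], lo, hi, levels)
             for i in range(len(data) - m + 1)]
    matches, verified = [], 0
    for offset, codes in enumerate(index):  # single sequential pass
        lower, upper = distance_bounds(query, codes, lo, hi, levels)
        if lower > eps:
            continue                 # pruned: cannot be a match
        if upper <= eps:
            matches.append(offset)   # accepted without reading raw data
            continue
        verified += 1                # bounds inconclusive: check raw data
        if euclidean(query, data[offset:offset + m]) <= eps:
            matches.append(offset)
    return matches, verified
```

Calling `range_search(query, data, eps)` returns the matching offsets together with the number of windows whose raw values had to be fetched. In this sketch, the tighter the bounds (for example, with more quantization levels), the fewer windows fall into the inconclusive case, which is the false-alarm reduction the abstract refers to.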