Prefix-querying: an approach for effective subsequence matching under time warping in sequence databases

Authors:
Sanghyun Park;Sang-Wook Kim;June-Suh Cho;Sriram Padmanabhan
Affiliations:
IBM T.J. Watson Research Center;Kangwon National University;IBM T.J. Watson Research Center;IBM T.J. Watson Research Center
Venue:
Proceedings of the tenth international conference on Information and knowledge management
Year:
2001

Abstract

This paper presents an index-based subsequence matching method that supports time warping in large sequence databases. Time warping makes it possible to find sequences with similar patterns even when they differ in length. In our earlier work, we proposed an efficient method for whole matching under time warping. That method builds a multi-dimensional index on a set of feature vectors that are extracted from the data sequences and are invariant to time warping. To filter in feature space, it also applies a lower-bound function that consistently underestimates the time warping distance and satisfies the triangle inequality.

In this paper, we incorporate a prefix-querying approach based on sliding windows into the earlier method. For indexing, we extract a feature vector from every subsequence inside a sliding window and build a multi-dimensional index that uses the feature vectors as indexing attributes. For query processing, we perform a series of index searches using the feature vectors of qualifying query prefixes. Our approach provides effective and scalable subsequence matching even on large databases. We also prove that it incurs no false dismissals. To verify the advantage of our method, we perform extensive experiments. The results show that it achieves significant speedup on real-world S&P 500 stock data and on very large synthetic data.
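
The filter-and-refine idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: the abstract does not say which features are extracted, so the 4-tuple (first, last, greatest, smallest element) used here is an assumption, as are all function names, the choice of absolute difference as the base cost, and the linear scan that stands in for the multi-dimensional index. For brevity the sketch compares the whole query against each sliding window rather than issuing a series of searches for query prefixes.

```python
from typing import List, Sequence, Tuple

Features = Tuple[float, float, float, float]


def time_warping_distance(s: Sequence[float], q: Sequence[float]) -> float:
    """Dynamic-programming time warping distance with |a - b| as base cost."""
    inf = float("inf")
    prev = [inf] * (len(q) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, len(s) + 1):
        cur = [inf] * (len(q) + 1)
        for j in range(1, len(q) + 1):
            cost = abs(s[i - 1] - q[j - 1])
            # Either sequence may repeat an element, or both may advance.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(q)]


def features(seq: Sequence[float]) -> Features:
    """Assumed warping-invariant features: first, last, greatest, smallest."""
    return (seq[0], seq[-1], max(seq), min(seq))


def lower_bound(f: Features, g: Features) -> float:
    """Largest feature difference; never exceeds the time warping distance."""
    return max(abs(a - b) for a, b in zip(f, g))


def filter_and_refine(data: Sequence[float], query: Sequence[float],
                      window: int, eps: float) -> List[int]:
    """Return offsets of sliding windows within eps of the query."""
    fq = features(query)
    matches = []
    for offset in range(len(data) - window + 1):
        sub = data[offset:offset + window]
        # Filter: a cheap feature-space test that cannot discard a true match.
        if lower_bound(features(sub), fq) > eps:
            continue
        # Refine: confirm the candidate with the exact distance.
        if time_warping_distance(sub, query) <= eps:
            matches.append(offset)
    return matches
```

Because the lower bound never overestimates the exact distance, the filter step can produce false alarms, which the refine step removes, but no false dismissals. A call such as `filter_and_refine(data, query, window=4, eps=0.5)` returns the starting offsets of the matching windows.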