Continuous Similarity-Based Queries on Streaming Time Series

Authors:
Like Gao;Xiaoyang Sean Wang
Affiliations:
IEEE;IEEE Computer Society
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2005

Abstract

In many applications, local or remote sensors send in streams of data, and the system must monitor these streams to discover relevant events or patterns and react immediately. An important scenario is one in which the incoming stream is a continually appended time series and the patterns are time series stored in a database. Each time a new value arrives (at what is called a time position), the system needs to find, from the database, the nearest or near neighbors of the incoming time series up to that time position. This paper addresses the problem by using the Fast Fourier Transform (FFT) to efficiently compute the cross correlations of time series, which yields, in batch mode, the nearest and near neighbors of the incoming time series at many time positions. To exploit this batch processing while still achieving fast response times, the paper uses prediction methods to predict future values. When the prediction length is long, FFT is used to compute the cross correlations between the predicted series (together with the values that have already arrived) and the database patterns, and thereby to obtain predicted distances between the incoming time series at many future time positions and the database patterns. When the prediction length is short, these predicted distances are computed directly to avoid the overhead of FFT. When the actual data value arrives, the prediction error and the predicted distances are used together to filter out patterns that cannot be the nearest or near neighbors, which provides fast responses. Experiments show that, with reasonable prediction errors, the performance gain is significant. In particular, when long-term predictions are available, the proposed method can handle incoming data at a very fast streaming rate.
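The batch idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the distance is taken to be the Euclidean distance between a pattern and each sliding window of the same length, the filter uses a triangle-inequality lower bound (predicted distance minus prediction error), and all function names (`fft`, `cross_correlation`, `batch_distances`, `prune_candidates`) are hypothetical. The small radix-2 FFT is included only to keep the example free of dependencies; in practice a library routine such as `numpy.fft.fft` would normally be used.

```python
import cmath
import math


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.
    With invert=True the result is the unnormalized inverse transform."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def cross_correlation(stream, pattern):
    """Return c[j] = sum_i stream[j + i] * pattern[i] for every full overlap j."""
    n, m = len(stream), len(pattern)
    size = 1
    while size < n + m:  # zero-pad so circular convolution does not wrap around
        size *= 2
    fs = fft([complex(v) for v in stream] + [0j] * (size - n))
    # Reversing the pattern turns the convolution into a correlation.
    fp = fft([complex(v) for v in reversed(pattern)] + [0j] * (size - m))
    conv = fft([x * y for x, y in zip(fs, fp)], invert=True)
    # Normalize the inverse transform; full overlaps start at index m - 1.
    return [conv[j + m - 1].real / size for j in range(n - m + 1)]


def batch_distances(stream, pattern):
    """Euclidean distance between the pattern and every length-m window,
    using ||w - p||^2 = ||w||^2 + ||p||^2 - 2 * <w, p>."""
    m = len(pattern)
    corr = cross_correlation(stream, pattern)
    p_energy = sum(v * v for v in pattern)
    prefix = [0.0]  # prefix sums of squares give each window's energy
    for v in stream:
        prefix.append(prefix[-1] + v * v)
    dists = []
    for j, c in enumerate(corr):
        w_energy = prefix[j + m] - prefix[j]
        # Clamp tiny negative values caused by floating-point round-off.
        dists.append(math.sqrt(max(w_energy + p_energy - 2.0 * c, 0.0)))
    return dists


def prune_candidates(predicted_dists, pred_error, eps):
    """Assumed filter: by the triangle inequality the true distance is at
    least predicted - pred_error, so a pattern whose lower bound exceeds
    eps cannot be a near neighbor and is dropped."""
    return [i for i, d in enumerate(predicted_dists) if d - pred_error <= eps]
```

For example, `batch_distances([1, 2, 3, 4, 5, 6], [3, 4, 5])` returns one distance per window position, with a value close to zero at the window that equals the pattern. One FFT pass yields all positions at once, which is why the batch mode pays off only when many positions (for instance, many predicted future values) are available; for a handful of positions the direct sum is cheaper, matching the short-prediction case described in the abstract.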