Efficient Similarity Search over Future Stream Time Series

Authors:
Xiang Lian;Lei Chen
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2008

Abstract

With advances in hardware and communication technologies, stream time series are gaining ever-increasing attention because of their importance in many applications, such as financial data processing, network monitoring, web click-stream analysis, sensor data mining, and anomaly detection. All of these applications require efficient and effective similarity search over stream data. Although many approaches have been proposed for searching archived data, traditional methods may not work in stream scenarios because of the unique characteristics of streams: data are updated frequently and real-time responses are required. In particular, when the arrival of data is delayed, for example by communication congestion or batch processing, queries over such incomplete time series, or even over future time series, may return inaccurate results under the traditional approaches. Therefore, in this paper we propose three approaches, polynomial, DFT, and probabilistic, to predict the unknown values that have not yet arrived at the system and to answer queries based on the predicted data. We also present efficient indexes, namely a multidimensional hash index and a B+-tree, to facilitate prediction and similarity search on future time series, respectively. Extensive experiments demonstrate the efficiency and effectiveness of our methods in terms of I/O, prediction accuracy, and query accuracy.
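
To make the idea of querying predicted data concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the simplest polynomial case (a degree-1 least-squares fit over the observed values), Euclidean distance as the similarity measure, and a linear scan in place of the indexes described in the abstract. The function names `predict_linear`, `euclidean`, and `matching_patterns`, and the parameters `steps` and `eps`, are hypothetical.

```python
from math import sqrt


def predict_linear(history, steps):
    """Fit y = a + b*t to `history` by least squares and extrapolate `steps` values.

    Assumption: a degree-1 polynomial stands in for the polynomial predictor.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observed values")
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    # Time steps n, n+1, ... are the values that have not arrived yet.
    return [intercept + slope * t for t in range(n, n + steps)]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def matching_patterns(history, patterns, steps, eps):
    """Return names of patterns within distance `eps` of the predicted future values.

    Assumption: a linear scan replaces the index-based search.
    """
    predicted = predict_linear(history, steps)
    return [
        name
        for name, pattern in patterns.items()
        if len(pattern) == steps and euclidean(predicted, pattern) <= eps
    ]


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]
    patterns = {"up": [5.0, 6.0], "down": [3.0, 2.0]}
    print(matching_patterns(observed, patterns, steps=2, eps=0.5))
```

A higher-degree fit, a DFT-based extrapolation, or a probabilistic predictor could replace `predict_linear` without changing the query step, which only compares the predicted values against each pattern.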