Adaptive similarity search in streaming time series with sliding windows

Authors:
Maria Kontaki;Apostolos N. Papadopoulos;Yannis Manolopoulos
Affiliations:
Data Engineering Research Laboratory, Department of Informatics, Aristotle University, 54124 Thessaloniki, Greece;Data Engineering Research Laboratory, Department of Informatics, Aristotle University, 54124 Thessaloniki, Greece;Data Engineering Research Laboratory, Department of Informatics, Aristotle University, 54124 Thessaloniki, Greece
Venue:
Data & Knowledge Engineering
Year:
2007

Abstract

The challenge in a database of evolving time series is to provide efficient algorithms and access methods for query processing, given that the database changes continuously as new data become available. Traditional access methods that continuously update the data are considered inappropriate because of their significant update costs. In this paper, we use the IDC-Index (Incremental DFT Computation - Index), an efficient technique for similarity query processing in streaming time series. The index is based on a multidimensional access method enhanced with a deferred update policy and an incremental computation of the Discrete Fourier Transform (DFT), which is used as a feature extraction method. We focus on both range and nearest-neighbor queries, since both types are frequently used in modern applications. An important characteristic of the proposed approach is its ability to adapt to the update frequency of the data streams. Using a simple heuristic, we keep the update frequency at a specified level to guarantee efficiency. To investigate the efficiency of the proposed method, experiments were performed for range queries and k-nearest-neighbor queries on real-life data sets. The proposed method reduces the number of false alarms examined, achieving a high ratio of answers to candidates. Moreover, the results show that the new techniques consistently outperform previously proposed approaches.
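
The incremental DFT computation mentioned in the abstract can be illustrated with the standard sliding-window DFT recurrence. The sketch below is not taken from the paper: it assumes an unnormalized DFT, a window that slides by one value at a time, and hypothetical names (`dft_prefix`, `slide_dft`, `W`, `F`). When the window advances, each retained coefficient is updated in constant time from the value that leaves and the value that enters, instead of being recomputed over the whole window.

```python
import cmath

def dft_prefix(window, num_coeffs):
    """Directly compute the first num_coeffs (unnormalized) DFT coefficients."""
    w = len(window)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * n / w) for n, x in enumerate(window))
        for k in range(num_coeffs)
    ]

def slide_dft(coeffs, oldest, newest, w):
    """Update the coefficients after the window of length w slides by one value.

    Standard recurrence: X'_k = (X_k - oldest + newest) * exp(2*pi*i*k / w).
    """
    return [
        (c - oldest + newest) * cmath.exp(2j * cmath.pi * k / w)
        for k, c in enumerate(coeffs)
    ]

# Illustrative data only (not from the paper's experiments).
stream = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0]
W, F = 4, 2  # assumed window length and number of coefficients kept as features

coeffs = dft_prefix(stream[:W], F)
max_error = 0.0
for t in range(1, len(stream) - W + 1):
    # The value leaving the window is stream[t - 1]; the one entering is stream[t + W - 1].
    coeffs = slide_dft(coeffs, stream[t - 1], stream[t + W - 1], W)
    direct = dft_prefix(stream[t:t + W], F)
    max_error = max(max_error, max(abs(a - b) for a, b in zip(coeffs, direct)))
```

Here `max_error` tracks the largest gap between the incrementally maintained coefficients and a direct recomputation; it should stay at floating-point rounding level. Each update costs O(F) per arriving value rather than O(F·W), which is what makes DFT-based feature extraction practical when values arrive continuously.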