Similarity search over time-series databases has long been an active research topic, with applications in multimedia retrieval, data mining, web search and retrieval, and other areas. However, because time series have high dimensionality (i.e., length), similarity search over directly indexed time series usually suffers from a serious problem known as the "dimensionality curse". Many dimensionality reduction techniques have therefore been proposed to break this curse by reducing the dimensionality of time series. Among the proposed methods, only Piecewise Linear Approximation (PLA) lacks an indexing mechanism to support similarity queries, which prevents it from searching very large time-series databases efficiently. Our initial studies on the effectiveness of different reduction methods, however, show that PLA performs no worse than the others. Motivated by this, in this paper we re-investigate PLA for approximating and indexing time series. Specifically, we propose a novel distance function in the reduced PLA space and prove that it is a lower bound of the Euclidean distance between the original time series, which guarantees no false dismissals during similarity search. As a second step, we develop an effective approach to index these lower bounds to improve search efficiency. Our extensive experiments over a wide spectrum of real and synthetic data sets demonstrate the efficiency and effectiveness of PLA together with the newly proposed lower-bound distance, in terms of both pruning power and wall-clock time, compared with two state-of-the-art reduction methods, Adaptive Piecewise Constant Approximation (APCA) and Chebyshev Polynomials (CP).
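The lower-bounding idea can be illustrated with a minimal Python sketch. It is not the distance function proposed in the paper, which the abstract does not spell out. It assumes equal-length segments and a per-segment least-squares line fit, and it relies on a standard fact: a least-squares fit is an orthogonal projection, so the distance between two fitted approximations cannot exceed the Euclidean distance between the original series. The names `fit_line`, `pla`, `pla_lower_bound`, and `euclidean` are hypothetical helpers introduced only for this example.

```python
import math
import random


def fit_line(values):
    """Least-squares line a*t + b over t = 0..n-1; returns (a, b)."""
    n = len(values)
    if n == 1:
        return 0.0, float(values[0])
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    a = num / den
    return a, v_mean - a * t_mean


def pla(series, seg_len):
    """Approximate a series by one fitted line per segment.

    Assumption: fixed-length segments (the last one may be shorter).
    Returns a list of (slope, intercept, segment_length) tuples.
    """
    segments = []
    for start in range(0, len(series), seg_len):
        chunk = series[start:start + seg_len]
        a, b = fit_line(chunk)
        segments.append((a, b, len(chunk)))
    return segments


def pla_lower_bound(p, q):
    """Euclidean distance between two PLA reconstructions.

    With identical segmentation this never exceeds the Euclidean
    distance between the original series (projection property).
    """
    if len(p) != len(q):
        raise ValueError("approximations must use the same segmentation")
    total = 0.0
    for (a1, b1, n1), (a2, b2, n2) in zip(p, q):
        if n1 != n2:
            raise ValueError("segment lengths differ")
        da, db = a1 - a2, b1 - b2
        total += sum((da * t + db) ** 2 for t in range(n1))
    return math.sqrt(total)


def euclidean(x, y):
    """Exact Euclidean distance between two equal-length series."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(x, y)))


if __name__ == "__main__":
    random.seed(1)
    x = [random.gauss(0, 1) for _ in range(32)]
    y = [random.gauss(0, 1) for _ in range(32)]
    print("lower bound:", pla_lower_bound(pla(x, 8), pla(y, 8)))
    print("true distance:", euclidean(x, y))
```

In a filter-and-refine search, a candidate whose lower bound already exceeds the query threshold can be discarded without reading the full series. Because the bound never overestimates the true distance, no qualifying series is dropped, which is the "no false dismissals" property the abstract refers to.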