Most existing work on sequence databases uses correlation measures (e.g., Euclidean distance and Pearson correlation) as a core function for various analytical tasks. Such work typically requires users to fix a length for similarity queries, yet there is no principled way to choose the proper length for different application needs. In this work, we focus on discovering the longest-lasting highly correlated subsequences in sequence databases, which is particularly useful for analyses that lack prior knowledge of the query length. Surprisingly, there has been limited work on this problem. A baseline solution is to calculate the correlation for every possible combination of subsequences, but this brute-force approach does not scale to large datasets. We therefore study a space-constrained index that gives a tight correlation bound for subsequences of similar length and offset, using intra-object and inter-object grouping techniques. To the best of our knowledge, this is the first index to support a normalized distance metric over subsequences of arbitrary length. Extensive experimental evaluation on both real and synthetic sequence datasets verifies the efficiency and effectiveness of the proposed methods.
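To make the brute-force baseline concrete, the following is a minimal sketch and not the paper's algorithm or index. It assumes a simplified setting that the abstract does not spell out: two sequences compared over aligned windows (same start offset in both), Pearson correlation as the measure, and a user-chosen threshold. The names `pearson`, `longest_correlated_window`, `threshold`, and `min_len` are hypothetical and introduced only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant window; treat as 0
    return sxy / math.sqrt(sxx * syy)


def longest_correlated_window(s, t, threshold, min_len=3):
    """Brute-force search (assumed setting: aligned windows of two sequences).

    Tries window lengths from longest to shortest and returns the first
    (start, length, correlation) whose correlation reaches the threshold,
    or None if no window of at least min_len qualifies.
    """
    n = min(len(s), len(t))
    for length in range(n, min_len - 1, -1):
        for start in range(0, n - length + 1):
            r = pearson(s[start:start + length], t[start:start + length])
            if r >= threshold:
                return start, length, r
    return None


if __name__ == "__main__":
    s = [1, 2, 3, 4, 9, 1]
    t = [2, 4, 6, 8, 1, 9]
    print(longest_correlated_window(s, t, threshold=0.99))
```

Even in this restricted setting the search examines on the order of n^2 windows and recomputes each correlation in time proportional to the window length, so the cost grows roughly cubically in the sequence length for a single pair. Allowing arbitrary offsets and many sequences multiplies this further, which is the scalability problem the abstract refers to.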