This paper proposes a general framework for matching similar subsequences in both time series and string databases. The matching results are pairs of query subsequences and database subsequences. The framework finds all pairs of similar subsequences provided that the distance measure satisfies "consistency", a property introduced in this paper. We show that most popular distance functions, such as the Euclidean distance, DTW, ERP, and the Fréchet distance for time series, and the Hamming distance and Levenshtein distance for strings, are all "consistent". We also propose a generic index structure for metric spaces named "reference net". The reference net occupies O(n) space, where n is the size of the dataset, and is optimized to work well with our framework. The experiments demonstrate that our method improves retrieval performance when combined with diverse distance measures. They also illustrate that the reference net scales well in terms of space overhead and query time.
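To make the problem statement concrete, the following is a minimal brute-force sketch of what "pairs of query subsequences and database subsequences" means for strings under the Levenshtein distance. It is not the framework or the reference net described above; it is only a naive baseline that enumerates every pair. The function names, the `min_len` parameter, and the threshold `eps` are assumptions introduced for illustration.

```python
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def substrings(s: str, min_len: int):
    """Yield (start_offset, substring) for every contiguous substring
    of length at least min_len (hypothetical helper)."""
    for start in range(len(s)):
        for end in range(start + min_len, len(s) + 1):
            yield start, s[start:end]


def brute_force_matches(query: str, database: str, min_len: int, eps: int):
    """Return every (query_offset, query_sub, db_offset, db_sub) whose
    edit distance is at most eps. Exhaustive, so only usable on tiny inputs."""
    matches = []
    for (qs, q), (ds, d) in product(substrings(query, min_len),
                                    substrings(database, min_len)):
        if levenshtein(q, d) <= eps:
            matches.append((qs, q, ds, d))
    return matches


if __name__ == "__main__":
    for match in brute_force_matches("abc", "xbcx", min_len=2, eps=0):
        print(match)
```

The exhaustive enumeration grows quadratically in the number of subsequences on each side, which is the cost that an index-based approach such as the one proposed in the abstract aims to avoid.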