Time series motifs are sets of very similar subsequences of a long time series. They are of interest in their own right, and they also serve as inputs to several higher-level data mining algorithms, including classification, clustering, rule discovery and summarization. Despite extensive research in recent years, finding time series motifs exactly in massive databases remains an open problem. Previous efforts either found approximate motifs or considered relatively small datasets residing in main memory. In this work, we build on previous work on pivot-based indexing to introduce a disk-aware algorithm that finds time series motifs exactly in multi-gigabyte databases containing on the order of tens of millions of time series. We evaluate our algorithm on datasets from diverse areas, including medicine, anthropology, computer networking and image processing, and show that it finds interesting and meaningful motifs in datasets that are many orders of magnitude larger than anything considered before.
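To make the idea of exact motif discovery with a pivot concrete, the following is a minimal in-memory sketch. It is not the disk-aware algorithm of this work. It assumes Euclidean distance, treats the motif as the single closest pair, and uses one arbitrarily chosen pivot; the function names `dist`, `brute_force_motif` and `pivot_motif` are hypothetical. The pruning relies only on the triangle inequality: for any pivot p, |d(p, a) - d(p, b)| is a lower bound on d(a, b), so pairs whose pivot distances differ by at least the best distance found so far can be skipped without losing exactness.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_motif(series):
    """Exact closest pair by comparing every pair: O(n^2) distance calls."""
    best = (math.inf, None, None)
    for i in range(len(series)):
        for j in range(i + 1, len(series)):
            d = dist(series[i], series[j])
            if d < best[0]:
                best = (d, i, j)
    return best

def pivot_motif(series, pivot_index=0):
    """Exact closest pair using one pivot (an assumption of this sketch).

    Sort all series by their distance to the pivot. For two series i and j,
    |ref[j] - ref[i]| <= dist(i, j) by the triangle inequality, so once the
    gap in pivot distance reaches the best distance found so far, no later
    series in the sorted order can improve on it.
    """
    pivot = series[pivot_index]
    ref = [dist(pivot, s) for s in series]
    order = sorted(range(len(series)), key=lambda i: ref[i])
    best = (math.inf, None, None)
    for pos in range(len(order)):
        i = order[pos]
        for k in range(pos + 1, len(order)):
            j = order[k]
            if ref[j] - ref[i] >= best[0]:
                break  # lower bound already too large; later gaps only grow
            d = dist(series[i], series[j])
            if d < best[0]:
                best = (d, min(i, j), max(i, j))
    return best
```

Both functions return a tuple `(distance, i, j)` and agree on the closest distance; the pivot version typically computes far fewer true distances when the pivot distances are well spread. Scaling this to data that does not fit in memory requires additional machinery that this sketch does not attempt.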