Most time series data mining algorithms use similarity search as a core subroutine, so the time taken for similarity search is the bottleneck for virtually all of them, including classification, clustering, motif discovery, and anomaly detection. The difficulty of scaling search to large datasets largely explains why most academic work on time series data mining has plateaued at a few million time series objects, while much of industry and science sits on billions of time series objects waiting to be explored. In this work we show that a combination of four novel ideas lets us search and mine massive time series for the first time. We demonstrate the following counterintuitive fact: in large datasets we can search exactly under Dynamic Time Warping (DTW) much more quickly than the current state-of-the-art Euclidean distance search algorithms. We demonstrate our work on the largest set of time series experiments ever attempted. In particular, the largest dataset we consider is larger than the combined size of all of the time series datasets considered in all data mining papers ever published. We explain how our ideas allow us to solve higher-level time series data mining problems, such as motif discovery and clustering, at scales that would otherwise be untenable. Moreover, we show how our ideas allow us to efficiently support the uniform scaling distance measure, a measure whose utility seems to be underappreciated and which we demonstrate here. In addition to mining massive datasets with up to one trillion datapoints, we show that our ideas also have implications for real-time monitoring of data streams, allowing us to handle much faster arrival rates and/or use cheaper and lower-powered devices than are currently possible.
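For readers unfamiliar with the distance measure named in the abstract, the following is a minimal sketch of textbook DTW with an optional warping-window constraint. It is not the paper's method and implements none of its four ideas. The function name `dtw_distance`, the `window` parameter, and the choice of squared point-wise cost are assumptions made for illustration only.

```python
import math

def dtw_distance(a, b, window=None):
    """Textbook O(n*m) DTW between two numeric sequences (illustrative only).

    window: hypothetical parameter limiting how far the alignment may stray
    from the diagonal; None means unconstrained.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # The band must be at least |n - m| wide for any alignment to exist.
    window = max(window, abs(n - m))

    inf = math.inf
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return math.sqrt(prev[m])

# The second series is a time-shifted copy of the first.
x = [0, 0, 1, 2]
y = [0, 1, 2, 2]
print(dtw_distance(x, y))            # unconstrained warping
print(dtw_distance(x, y, window=0))  # no warping allowed
```

With `window=0` and equal-length inputs, the alignment is forced onto the diagonal, so the result coincides with the Euclidean distance. Widening the window lets the two shifted series align at a lower cost.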