Visually mining and monitoring massive time series
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Visualizing and discovering non-trivial patterns in large time series databases
Information Visualization
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Knowledge and Information Systems
Mining sequential patterns across time sequences
New Generation Computing
Temporal pattern matching for the prediction of stock prices
AIDM '07 Proceedings of the 2nd international workshop on Integrating artificial intelligence and data mining - Volume 84
Characterizing individual communication patterns
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
On privacy in time series data mining
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
DUST: a generalized notion of similarity between uncertain time series
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
A new class of attacks on time series data mining
Intelligent Data Analysis
Lag patterns in time series databases
DEXA'10 Proceedings of the 21st international conference on Database and expert systems applications: Part II
Increasing availability of industrial systems through data stream mining
Computers and Industrial Engineering
Weighted dynamic time warping for time series classification
Pattern Recognition
Traffic events modeling for structural health monitoring
IDA'11 Proceedings of the 10th international conference on Advances in intelligent data analysis X
Visual data mining for identification of patterns and outliers in weather stations' data
IDEAL'12 Proceedings of the 13th international conference on Intelligent Data Engineering and Automated Learning
Short communication: Selective Subsequence Time Series clustering
Knowledge-Based Systems
ACM Computing Surveys (CSUR)
Feature selection for classification of oscillating time series
Expert Systems: The Journal of Knowledge Engineering
Incremental Algorithm for Discovering Frequent Subsequences in Multiple Data Streams
International Journal of Data Warehousing and Mining
Preserving Privacy in Time Series Data Mining
International Journal of Data Warehousing and Mining
Artificial Intelligence in Medicine
DTW-D: time series semi-supervised learning from a single example
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Time series symbolization and search for frequent patterns
Proceedings of the Fourth Symposium on Information and Communication Technology
Given the recent explosion of interest in streaming data and online algorithms, clustering of time-series subsequences, extracted via a sliding window, has received much attention. In this work, we make a surprising claim. Clustering of time-series subsequences is meaningless. More concretely, clusters extracted from these time series are forced to obey a certain constraint that is pathologically unlikely to be satisfied by any dataset, and because of this, the clusters extracted by any clustering algorithm are essentially random. While this constraint can be intuitively demonstrated with a simple illustration and is simple to prove, it has never appeared in the literature. We can justify calling our claim surprising because it invalidates the contribution of dozens of previously published papers. We will justify our claim with a theorem, illustrative examples, and a comprehensive set of experiments on reimplementations of previous work. Although the primary contribution of our work is to draw attention to the fact that an apparent solution to an important problem is incorrect and should no longer be used, we also introduce a novel method that, based on the concept of time-series motifs, is able to meaningfully cluster subsequences on some time-series datasets.
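The abstract does not spell out the constraint it refers to, so the following is only a minimal sketch of one way to build intuition for it, not the authors' theorem or experimental setup. It extracts subsequences with a sliding window and computes their pointwise mean. Because each cluster center is the mean of its members, the size-weighted average of the centers of any partition must equal this overall mean, and for sliding-window subsequences that mean is close to a flat line. The test signal, the window length of 32, and the helper names `sliding_windows` and `pointwise_mean` are assumptions chosen for illustration.

```python
import math
import random


def sliding_windows(series, w):
    """Return every length-w subsequence of series, stepping one point at a time."""
    return [series[i:i + w] for i in range(len(series) - w + 1)]


def pointwise_mean(windows):
    """Average a list of equal-length windows position by position."""
    n = len(windows)
    return [sum(column) / n for column in zip(*windows)]


random.seed(0)

# Hypothetical test signal (an assumption): a noisy sine wave with period 50.
series = [math.sin(2 * math.pi * t / 50) + random.gauss(0, 0.1)
          for t in range(2000)]

windows = sliding_windows(series, 32)
mean_shape = pointwise_mean(windows)

# Consecutive positions of the mean average almost the same set of points,
# so the mean of all subsequences carries very little shape of its own.
spread = max(mean_shape) - min(mean_shape)
signal_range = max(series) - min(series)

print(f"number of subsequences: {len(windows)}")
print(f"range of the raw signal: {signal_range:.3f}")
print(f"range of the mean subsequence: {spread:.3f}")
```

Under these assumptions the mean subsequence is almost constant even though the individual subsequences vary widely, which suggests how tightly the centers of sliding-window clusters are tied together regardless of the input data.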