Useful clustering outcomes from meaningful time series clustering

Authors:
Jason R. Chen
Affiliations:
The Australian National University, Canberra, ACT, Australia
Venue:
AusDM '07 Proceedings of the sixth Australasian conference on Data mining and analytics - Volume 70
Year:
2007

Abstract

Clustering time series data with the popular subsequence time series (STS) technique has been widely used in the data mining and wider communities. It was recently concluded that the technique is meaningless, based on the findings that it produces (a) clustering outcomes for distinct time series that are not distinguishable from one another, and (b) cluster centroids that are smoothed. More recent work has since shown that (a) can be solved by introducing a lag in the subsequence vector construction process; however, we show in this paper that such an approach does not solve (b). Adopting the terminology that a clustering method which overcomes (a) is meaningful, while one which overcomes both (a) and (b) is useful, we propose an approach that produces useful time series clustering. The approach is based on restricting the clustering space to extend only over the region visited by the time series in the subsequence vector space. We test the approach on a set of 12 diverse real-world and synthetic data sets and find that (a) one can distinguish between the clusterings of these time series, and (b) the centroids produced in each case retain the character of the underlying series from which they came.
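
As an illustration of the subsequence vector construction the abstract refers to, the following is a minimal sketch, not the paper's implementation. It assumes a sliding window over a one-dimensional series, with an optional lag between the samples placed in each vector. The function name `subsequence_vectors` and the parameters `width` and `lag` are hypothetical, chosen only for this example. The resulting rows are the points a clustering algorithm would then operate on.

```python
import numpy as np

def subsequence_vectors(series, width, lag=1):
    """Build subsequence vectors from a 1-D series (illustrative sketch).

    Row t is [x[t], x[t + lag], ..., x[t + (width - 1) * lag]].
    lag=1 gives the classic sliding-window construction; lag > 1 gives
    a lagged construction. Names and defaults are assumptions.
    """
    x = np.asarray(series, dtype=float)
    span = (width - 1) * lag              # offset from first to last sample in a vector
    n_vectors = len(x) - span
    if n_vectors <= 0:
        raise ValueError("series too short for this width and lag")
    # Index matrix: row t selects samples t, t + lag, ..., t + span.
    idx = np.arange(n_vectors)[:, None] + lag * np.arange(width)[None, :]
    return x[idx]

series = np.arange(10.0)                                  # toy series 0, 1, ..., 9
unit_lag = subsequence_vectors(series, width=3)           # rows [0,1,2], [1,2,3], ...
lagged = subsequence_vectors(series, width=3, lag=2)      # rows [0,2,4], [1,3,5], ...
```

Each row of the returned array is one point in the subsequence vector space. The set of all rows traces the region that the series visits in that space.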