Making Subsequence Time Series Clustering Meaningful

Author: Jason R. Chen
Affiliation: Australian National University
Venue: ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Year: 2005

Citing 7

Nonlinear time series analysis

Identifying distinctive subsequences in multivariate time series by clustering
  KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining

A Survey of Temporal Knowledge Discovery Paradigms and Methods
  IEEE Transactions on Knowledge and Data Engineering

Maintaining variance and k-medians over data stream windows
  Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Clustering data streams
  FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science

Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research
  ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining

A Fuzzy-Set-Based Reconstructed Phase Space Method for Identification of Temporal Patterns in Complex Time Series
  IEEE Transactions on Knowledge and Data Engineering

Cited 8

In search of meaning for time series subsequence clustering: matching algorithms based on a new distance measure
  CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management

Cluster-based genetic segmentation of time series with DWT
  Pattern Recognition Letters

A data mining framework for time series estimation
  Journal of Biomedical Informatics

A review on time series data mining
  Engineering Applications of Artificial Intelligence

Why does subsequence time-series clustering produce sine waves?
  PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases

Substructure clustering: a novel mining paradigm for arbitrary data types
  SSDBM'12 Proceedings of the 24th international conference on Scientific and Statistical Database Management

Visual data mining for identification of patterns and outliers in weather stations' data
  IDEAL'12 Proceedings of the 13th international conference on Intelligent Data Engineering and Automated Learning

Short communication: Selective Subsequence Time Series clustering
  Knowledge-Based Systems


Abstract

Recently, the startling claim was made that sequential time series clustering is meaningless. This has important consequences for a significant amount of work in the literature, since such a claim invalidates the contribution of that work. In this paper, we show that sequential time series clustering is not meaningless, and that the problem highlighted in these works stems from their use of Euclidean distance as the distance measure in the subsequence vector space. As a solution, we consider quite a general class of time series and propose a regime based on two types of similarity that can exist between subsequence vectors, which gives rise naturally to an alternative to Euclidean distance in the subsequence vector space. We show that, using this alternative distance measure, sequential time series clustering can indeed be meaningful. We repeat a key experiment from the work on which the "meaningless" claim was based, and show that our method leads to a successful clustering outcome.
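
For readers unfamiliar with the setting, the following is a minimal sketch of what "subsequence vector space" and "Euclidean distance" usually mean in this literature: every length-w window of a series is treated as a point in w-dimensional space, and windows are compared point by point. The function names, the window length, and the toy series are illustrative assumptions and are not taken from the paper. The abstract does not define the alternative distance measure, so it is not shown here.

```python
import math

def sliding_subsequences(series, w):
    """Return every length-w window of series, stepping one point at a time."""
    if w <= 0 or w > len(series):
        raise ValueError("window length must be between 1 and len(series)")
    return [series[i:i + w] for i in range(len(series) - w + 1)]

def euclidean(u, v):
    """Euclidean distance between two equal-length subsequence vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Toy periodic series (hypothetical data, for illustration only).
series = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
subs = sliding_subsequences(series, 4)

# Windows one full period apart coincide; windows one step apart do not,
# even though they show the same shape shifted by a single sample.
d_same_phase = euclidean(subs[0], subs[4])
d_shifted = euclidean(subs[0], subs[1])
```

In this toy case the two windows that are a single sample apart end up at a nonzero Euclidean distance, even though they come from the same repeating pattern. This is only meant to show how overlapping windows are compared under this measure.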