In search of meaning for time series subsequence clustering: matching algorithms based on a new distance measure

  • Authors:
  • Dina Goldin;Ricardo Mardales;George Nagy

  • Affiliations:
  • Brown University, Providence, RI;University of Connecticut, Storrs, CT;Rensselaer Polytechnic Inst., Troy, NY

  • Venue:
  • CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
  • Year:
  • 2006

Abstract

Recent papers have claimed that the result of K-means clustering for time series subsequences (STS clustering) is independent of the time series that created it. Our paper revisits this claim. In particular, we consider the following question: Given several time series sequences and a set of STS cluster centroids from one of them (generated by the K-means algorithm), is it possible to reliably determine which of the sequences produced these cluster centroids? While recent results suggest that the answer should be NO, we answer this question in the affirmative.

We present cluster shape distance, an alternate distance measure for time series subsequence clusters, based on cluster shapes. Given a set of clusters, its shape is the sorted list of the pairwise Euclidean distances between their centroids. We then present two algorithms based on this distance measure, which match a set of STS cluster centroids with the time series that produced it. While the first algorithm creates smaller "fingerprints" for the sequences, the second is more accurate. In our experiments with a dataset of 10 sequences, it produced a correct match 100% of the time.

Furthermore, we offer an analysis that explains why our cluster shape distance provides a reliable way to match STS clusters to the original sequences, whereas cluster set distance fails to do so. Our work establishes for the first time a strong relation between the result of K-means STS clustering and the time series sequence that created it, despite earlier predictions that this is not possible.
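
The cluster shape defined in the abstract (the sorted list of pairwise Euclidean distances between centroids) can be illustrated with a minimal sketch. The abstract does not say how two shapes are compared or how the paper's two matching algorithms work, so `shape_distance` (Euclidean distance between the two sorted lists) and `best_match` (nearest shape wins) are assumptions made for illustration, not the authors' method. All function names are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def cluster_shape(centroids):
    """Sorted list of pairwise Euclidean distances between centroids.

    This follows the definition given in the abstract.
    """
    return sorted(dist(a, b) for a, b in combinations(centroids, 2))


def shape_distance(shape_a, shape_b):
    """Distance between two cluster shapes.

    ASSUMPTION: the abstract does not define this step; here the two
    sorted lists are compared element-wise with a Euclidean norm.
    """
    if len(shape_a) != len(shape_b):
        raise ValueError("shapes must come from the same number of clusters")
    return sqrt(sum((x - y) ** 2 for x, y in zip(shape_a, shape_b)))


def best_match(query_centroids, candidates):
    """Name of the candidate whose cluster shape is closest to the query.

    ASSUMPTION: a simple nearest-shape rule, not either of the paper's
    two algorithms. `candidates` maps a sequence name to the centroids
    obtained by clustering that sequence.
    """
    query = cluster_shape(query_centroids)
    return min(
        candidates,
        key=lambda name: shape_distance(query, cluster_shape(candidates[name])),
    )
```

Because the shape depends only on distances between centroids, it does not change when the centroids are listed in a different order or shifted by a constant offset, which is one plausible reason a shape-based measure behaves differently from a direct comparison of centroid sets.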