Making clustering in delay-vector space meaningful

Authors:
Jason R. Chen
Affiliations:
The Australian National University, Department of Information Engineering, Research School of Information Science and Engineering, College of Engineering and Computer Science, ACT 0200, Canberra, ...
Venue:
Knowledge and Information Systems
Year:
2007

Citing 11
Cited 3

MALM: a framework for mining sequence database at multiple abstraction levels

Proceedings of the seventh international conference on Information and knowledge management
Nonlinear time series analysis

Nonlinear time series analysis
Identifying distinctive subsequences in multivariate time series by clustering

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
A Survey of Temporal Knowledge Discovery Paradigms and Methods

IEEE Transactions on Knowledge and Data Engineering
Classification Rules + Time = Temporal Rules

ICCS '02 Proceedings of the International Conference on Computational Science-Part I
Indexing and Mining of the Local Patterns in Sequence Database

IDEAL '02 Proceedings of the Third International Conference on Intelligent Data Engineering and Automated Learning
Discovering Sequential Association Rules with Constraints and Time Lags in Multiple Sequences

ISMIS '02 Proceedings of the 13th International Symposium on Foundations of Intelligent Systems
A Motion Recognition Method by Using Primitive Motions

VDB 5 Proceedings of the Fifth Working Conference on Visual Database Systems: Advances in Visual Information Management
Maintaining variance and k-medians over data stream windows

Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
A Fuzzy-Set-Based Reconstructed Phase Space Method for Identification of Temporal Patterns in Complex Time Series

IEEE Transactions on Knowledge and Data Engineering

Useful clustering outcomes from meaningful time series clustering

AusDM '07 Proceedings of the sixth Australasian conference on Data mining and analytics - Volume 70
Establishing relationships among patterns in stock market data

Data & Knowledge Engineering
Data mining of vector–item patterns using neighborhood histograms

Knowledge and Information Systems

Quantified Score

Hi-index	0.02

Abstract

Sequential time series clustering is a technique used to extract important features from time series data. The method can be shown to be the process of clustering in the delay-vector space formalism used in the Dynamical Systems literature. Recently, the startling claim was made that sequential time series clustering is meaningless. This has important consequences for a significant amount of work in the literature, since such a claim invalidates the contribution of these works. In this paper, we show that sequential time series clustering is not meaningless, and that the problem highlighted in these works stems from their use of the Euclidean distance metric as the distance measure in the delay-vector space. As a solution, we consider quite a general class of time series, and propose a regime based on two types of similarity that can exist between delay vectors, giving rise naturally to an alternative distance measure to Euclidean distance in the delay-vector space. We show that, using this alternative distance measure, sequential time series clustering can indeed be meaningful. We repeat a key experiment in the work on which the “meaningless” claim was based, and show that our method leads to a successful clustering outcome.
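
For readers unfamiliar with the setting, the following is a minimal sketch of the baseline the abstract refers to: sliding-window subsequences treated as delay vectors and clustered under Euclidean distance. It is not the paper's method and does not implement the alternative distance measure, which the abstract does not specify. The function names, the sine test series, the window length `w=8`, and `k=3` are all illustrative assumptions.

```python
import math
import random


def delay_vectors(series, w):
    """Return every length-w sliding window of the series (unit-lag delay vectors)."""
    if w < 1 or w > len(series):
        raise ValueError("window length must be between 1 and len(series)")
    return [tuple(series[i:i + w]) for i in range(len(series) - w + 1)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means under Euclidean distance (the baseline, not the paper's measure)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from k distinct positions
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: euclidean(p, centers[j])) for p in points]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center unchanged
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, labels


# Hypothetical example data: a sampled sine wave.
series = [math.sin(0.3 * t) for t in range(200)]
vectors = delay_vectors(series, w=8)
centers, labels = kmeans(vectors, k=3)
```

Consecutive windows overlap in all but one sample, which is one reason the choice of distance measure in this space matters for the clustering outcome.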