State-space dynamics distance for clustering sequential data

Authors:
Darío García-García;Emilio Parrado-Hernández;Fernando Diaz-de-Maria
Affiliations:
Signal Theory and Communications Department, Escuela Politécnica Superior, Universidad Carlos III de Madrid, Avda. de la Universidad, 30, 28911 Leganés, Spain;Signal Theory and Communications Department, Escuela Politécnica Superior, Universidad Carlos III de Madrid, Avda. de la Universidad, 30, 28911 Leganés, Spain;Signal Theory and Communications Department, Escuela Politécnica Superior, Universidad Carlos III de Madrid, Avda. de la Universidad, 30, 28911 Leganés, Spain
Venue:
Pattern Recognition
Year:
2011

Abstract

This paper proposes a novel similarity measure for clustering sequential data. We first construct a common state space by training a single probabilistic model on all the sequences, which yields a unified representation of the dataset. Distances between sequences are then computed from the transition matrices that each sequence induces in that state space. This approach addresses some of the usual overfitting and scalability issues of existing semi-parametric techniques, which rely on training a separate model for each sequence. Empirical studies on both synthetic and real-world datasets illustrate the advantages of the proposed similarity measure for clustering sequences.
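
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: a plain k-means quantizer stands in for the single probabilistic model, each observation is hard-assigned to its nearest state, and the Frobenius norm is used to compare transition matrices. All function names and the `smoothing` parameter are hypothetical.

```python
import numpy as np


def assign_states(seq, centroids):
    """Hard-assign each observation (row of seq) to its nearest shared state."""
    d = np.linalg.norm(seq[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)


def fit_common_states(sequences, n_states, n_iter=20, seed=0):
    """Fit one set of states on the pooled observations of ALL sequences.

    Assumption: plain k-means replaces the probabilistic model of the paper.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack(sequences).astype(float)
    centroids = pooled[rng.choice(len(pooled), n_states, replace=False)]
    for _ in range(n_iter):
        labels = assign_states(pooled, centroids)
        for k in range(n_states):
            if np.any(labels == k):  # leave empty states where they are
                centroids[k] = pooled[labels == k].mean(axis=0)
    return centroids


def induced_transition_matrix(seq, centroids, smoothing=1e-3):
    """Row-stochastic transition matrix induced by one sequence.

    The additive smoothing (an assumption) avoids empty rows for states
    that the sequence never visits.
    """
    states = assign_states(np.asarray(seq, dtype=float), centroids)
    k = len(centroids)
    counts = np.full((k, k), smoothing)
    np.add.at(counts, (states[:-1], states[1:]), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)


def dynamics_distance_matrix(sequences, n_states):
    """Pairwise distances between the induced transition matrices.

    Assumption: Frobenius norm; the abstract does not name the distance.
    """
    centroids = fit_common_states(sequences, n_states)
    mats = [induced_transition_matrix(s, centroids) for s in sequences]
    n = len(mats)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = np.linalg.norm(mats[i] - mats[j])
    return dist
```

Each sequence is expected as an array of shape `(T, d)`. The resulting distance matrix could then be converted into affinities and passed to any clustering method that works on pairwise similarities. Because only one model is fit for the whole dataset, the per-sequence cost reduces to counting transitions, which is the scalability argument made in the abstract.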