Similarity-based clustering of sequences using hidden Markov models

  • Authors:
  • Manuele Bicego; Vittorio Murino; Mário A. T. Figueiredo

  • Affiliations:
  • Dipartimento di Informatica, Università di Verona, Verona, Italy; Dipartimento di Informatica, Università di Verona, Verona, Italy; Instituto de Telecomunicações, Instituto Superior Técnico, Lisboa, Portugal

  • Venue:
  • MLDM'03 Proceedings of the 3rd international conference on Machine learning and data mining in pattern recognition
  • Year:
  • 2003

Abstract

Hidden Markov models (HMMs) are a widely used tool for modelling sequential data; nevertheless, their use for clustering has received little attention. This paper proposes a novel scheme for HMM-based clustering of sequential data, inspired by the similarity-based paradigm recently introduced in the supervised learning context. In this approach, a new representation space is built in which each object is described by the vector of its similarities to a predetermined set of other objects. These similarities are computed using hidden Markov models, and clustering is then performed in that space. The difficult problem of clustering sequences is thus transposed to a more manageable one: clustering points (feature vectors). Experimental evaluation on synthetic and real data shows that the proposed approach largely outperforms standard HMM clustering schemes.
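
The idea of describing each sequence by a vector of HMM-derived similarities and then clustering those vectors can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not a detail taken from the abstract: the similarity is taken to be the length-normalized log-likelihood of a sequence under each reference HMM, the two discrete HMMs (`sticky` and `alternating`) are hand-specified stand-ins for models that would normally be trained on reference sequences, and plain k-means with a deterministic initialization is used as the point-clustering step.

```python
import math

def log_likelihood(seq, start, trans, emit):
    """Scaled forward algorithm: log P(seq | HMM) for a discrete-output HMM."""
    n = len(start)
    alpha = [start[i] * emit[i][seq[0]] for i in range(n)]
    scale = sum(alpha)
    loglik = math.log(scale)
    alpha = [a / scale for a in alpha]
    for sym in seq[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][sym]
                 for j in range(n)]
        scale = sum(alpha)          # rescale at each step to avoid underflow
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik

def similarity_vector(seq, models):
    """Assumed similarity: per-symbol log-likelihood under each reference HMM."""
    return [log_likelihood(seq, *m) / len(seq) for m in models]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with farthest-first initialization (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points,
                             key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical reference HMMs over the alphabet {0, 1}: (start, transition, emission).
emit = [[0.9, 0.1], [0.1, 0.9]]
sticky = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], emit)       # favours long runs
alternating = ([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], emit)  # favours switching
models = [sticky, alternating]

# Toy sequences: two made of long runs, two that alternate.
sequences = [
    [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
]

# Each sequence becomes a point in the similarity space; clustering happens there.
features = [similarity_vector(s, models) for s in sequences]
labels = kmeans(features, k=2)
print(labels)
```

In this sketch the sequences themselves are never compared directly; only their similarity vectors are, so any standard point-clustering method could replace the k-means step.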