Unfolding the Manifold in Generative Topographic Mapping

  • Authors: Raúl Cruz-Barbosa; Alfredo Vellido

  • Affiliations: Universitat Politècnica de Catalunya, Jordi Girona, Barcelona, Spain 08034 and Universidad Tecnológica de la Mixteca, Huajuapan, Oaxaca, México 69000; Universitat Politècnica de Catalunya, Jordi Girona, Barcelona, Spain 08034

  • Venue: HAIS '08: Proceedings of the 3rd International Workshop on Hybrid Artificial Intelligence Systems
  • Year: 2008

Abstract

Generative Topographic Mapping (GTM) is a probabilistic latent variable model for multivariate data clustering and visualization. It tries to capture the relevant data structure by defining a low-dimensional manifold embedded in the high-dimensional data space. This requires the assumption that the data can be faithfully represented by a manifold of much lower dimension than that of the observed space. Even when this assumption holds, approximating some datasets may require the manifold to fold heavily, resulting in an entangled manifold and in breaches of topology preservation that hamper data visualization and cluster definition. This can be partially avoided by modifying the GTM learning procedure so as to penalize divergences between the Euclidean distances from the data to the model prototypes and the corresponding geodesic distances along the manifold. We define and assess this strategy, comparing it to the performance of the standard GTM, using several artificial datasets.
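
The mismatch between Euclidean and geodesic distances that the abstract refers to can be illustrated with a small sketch. This is not the authors' implementation: it assumes that geodesic distances are approximated by shortest paths on a k-nearest-neighbour graph (a common approximation, not stated in the abstract), and the helper names `knn_graph` and `graph_distances`, the toy curve, and the choice `k=2` are all hypothetical. How such a divergence would enter the GTM learning procedure is not specified here, so the sketch only computes the divergence itself.

```python
import heapq
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph as {node: {neighbour: edge_length}}."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            d = math.dist(points[i], points[j])
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def graph_distances(graph, source):
    """Dijkstra shortest-path lengths from `source` (assumed geodesic proxy)."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy data (an assumption): 31 points on three quarters of a unit circle,
# i.e. a curled one-dimensional manifold embedded in two dimensions.
angles = [i * 1.5 * math.pi / 30 for i in range(31)]
points = [(math.cos(t), math.sin(t)) for t in angles]

graph = knn_graph(points, k=2)
geo = graph_distances(graph, source=0)

first, last = 0, len(points) - 1
euclid_end = math.dist(points[first], points[last])  # straight-line shortcut
geo_end = geo[last]                                  # path along the curve
divergence = geo_end - euclid_end

print(f"Euclidean: {euclid_end:.3f}  graph geodesic: {geo_end:.3f}  "
      f"divergence: {divergence:.3f}")
```

The two ends of the curve are close in the Euclidean sense but far apart along the manifold, which is the kind of pair for which a purely Euclidean criterion could favour a folded or entangled fit.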