Missing data imputation through GTM as a mixture of t-distributions

  • Authors: Alfredo Vellido
  • Affiliations: Department of Computing Languages and Systems (LSI), Polytechnic University of Catalonia (UPC), C. Jordi Girona, 1-3. 08034, Barcelona, Spain
  • Venue: Neural Networks
  • Year: 2006

Abstract

The Generative Topographic Mapping (GTM) was originally conceived as a probabilistic alternative to the well-known, neural network-inspired Self-Organizing Map. The GTM can also be interpreted as a constrained mixture of distributions. In recent years, much attention has been directed towards Student t-distributions as an alternative to Gaussians in mixture models because of their robustness to outliers. In this paper, the GTM is redefined as a constrained mixture of t-distributions, the t-GTM, and the Expectation-Maximization algorithm used to fit the model to the data is modified to carry out missing data imputation. Several experiments show that the t-GTM successfully detects outliers while minimizing their impact on the estimation of the model parameters. The t-GTM is also shown to impute missing values more accurately overall than the standard Gaussian GTM.
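
The abstract gives no formulas, so the following is only a minimal sketch of how an E-step for a mixture of t-distributions can downweight outliers and fill in missing values. It is not taken from the paper. It assumes isotropic components with a shared precision `beta`, a fixed degrees-of-freedom value `nu`, equal mixing weights, and missing entries encoded as `None`. The function names `t_log_density` and `e_step_point` are hypothetical. The weight `u = (nu + d) / (nu + delta2)` is the standard expected scale weight in EM for Student t mixtures, and the imputation rule (responsibility-weighted component centres) is the conditional mean under the isotropic assumption.

```python
import math


def t_log_density(x_obs, mean_obs, beta, nu):
    """Log density of an isotropic Student t, evaluated on observed dimensions only.

    Assumption: covariance is (1 / beta) * I, so the squared Mahalanobis
    distance reduces to beta * squared Euclidean distance.
    """
    d = len(x_obs)
    delta2 = beta * sum((a - b) ** 2 for a, b in zip(x_obs, mean_obs))
    log_norm = (
        math.lgamma((nu + d) / 2.0)
        - math.lgamma(nu / 2.0)
        + 0.5 * d * math.log(beta / (nu * math.pi))
    )
    return log_norm - 0.5 * (nu + d) * math.log1p(delta2 / nu), delta2


def e_step_point(x, centres, beta, nu):
    """E-step quantities for one data point with possibly missing entries.

    Returns (responsibilities, robust weights u, imputed copy of x).
    Assumption: all components have equal mixing weight, as in a GTM.
    """
    obs = [i for i, v in enumerate(x) if v is not None]
    mis = [i for i, v in enumerate(x) if v is None]
    x_obs = [x[i] for i in obs]

    log_p, u = [], []
    for y in centres:
        lp, delta2 = t_log_density(x_obs, [y[i] for i in obs], beta, nu)
        log_p.append(lp)
        # Small u means the point looks like an outlier for this component.
        u.append((nu + len(obs)) / (nu + delta2))

    # Normalise in log space to avoid underflow.
    top = max(log_p)
    unnorm = [math.exp(lp - top) for lp in log_p]
    total = sum(unnorm)
    resp = [w / total for w in unnorm]

    # Isotropic assumption: the conditional mean of a missing dimension,
    # given a component, is that component's centre in that dimension.
    imputed = list(x)
    for i in mis:
        imputed[i] = sum(r * y[i] for r, y in zip(resp, centres))
    return resp, u, imputed


if __name__ == "__main__":
    centres = [[0.0, 0.0], [4.0, 4.0]]
    print(e_step_point([0.0, None], centres, beta=1.0, nu=3.0))
    print(e_step_point([100.0, 100.0], centres, beta=1.0, nu=3.0))
```

In this sketch a point far from every centre receives weights `u` close to zero, so it would contribute little to a subsequent M-step update of the centres. A Gaussian mixture corresponds to the limit of large `nu`, where `u` tends to 1 for every point.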