Missing data imputation through GTM as a mixture of t-distributions

Authors:
Alfredo Vellido
Affiliations:
Department of Computing Languages and Systems (LSI), Polytechnic University of Catalonia (UPC), C. Jordi Girona, 1-3. 08034, Barcelona, Spain
Venue:
Neural Networks
Year:
2006

Abstract

The Generative Topographic Mapping (GTM) was originally conceived as a probabilistic alternative to the well-known, neural network-inspired Self-Organizing Map. The GTM can also be interpreted as a constrained mixture of distributions. In recent years, much attention has been directed towards Student t-distributions as an alternative to Gaussians in mixture models, due to their robustness to outliers. In this paper, the GTM is redefined as a constrained mixture of t-distributions, the t-GTM, and the Expectation-Maximization algorithm used to fit the model to the data is modified to carry out missing data imputation. Several experiments show that the t-GTM successfully detects outliers while minimizing their impact on the estimation of the model parameters. It is also shown that the t-GTM provides an overall more accurate imputation of missing values than the standard Gaussian GTM.
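
The robustness claim can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes a single isotropic t-distributed component and uses the per-point weight from standard EM for t-mixtures, u = (nu + d) / (nu + beta * ||x - mean||^2). The names `t_weights`, `nu` (degrees of freedom) and `beta` (inverse variance) are chosen here for illustration, and the data are made up. The sketch covers only the down-weighting of outliers, not the constrained GTM mapping or the imputation step.

```python
import numpy as np

def t_weights(X, mean, beta, nu):
    """Per-point weights for one isotropic t component (illustrative assumption).

    u_n = (nu + d) / (nu + beta * ||x_n - mean||^2), so points far from the
    component centre receive small weights.
    """
    d = X.shape[1]
    delta2 = beta * np.sum((X - mean) ** 2, axis=1)  # scaled squared distance
    return (nu + d) / (nu + delta2)

# Three points near the origin and one gross outlier (made-up data).
X = np.array([[0.1, -0.2],
              [0.0,  0.3],
              [-0.2, 0.1],
              [8.0,  8.0]])
mean = np.zeros(2)

u = t_weights(X, mean, beta=1.0, nu=3.0)

# Weighted mean update, as in the M-step for a t component, versus the
# unweighted mean that a Gaussian component would use.
robust_mean = (u[:, None] * X).sum(axis=0) / u.sum()
plain_mean = X.mean(axis=0)

print("weights:", u)
print("weighted mean:", robust_mean)
print("unweighted mean:", plain_mean)
```

In this toy setting the outlier receives a much smaller weight than the other points, so the weighted mean stays close to the bulk of the data while the unweighted mean is pulled towards the outlier.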