On fast supervised learning for normal mixture models with missing information

  • Authors:
  • Tsung I. Lin;Jack C. Lee;Hsiu J. Ho

  • Affiliations:
  • Department of Applied Mathematics, National Chung Hsing University, Taichung 402, Taiwan;Institute of Statistics and Graduate Institute of Finance, National Chiao Tung University, Hsinchu 300, Taiwan;Institute of Statistical Science, Academia Sinica, Taipei, Taiwan

  • Venue:
  • Pattern Recognition
  • Year:
  • 2006

Abstract

Handling mixture models when the data contain missing values is an important research issue. This paper introduces computational strategies based on auxiliary indicator matrices for efficiently fitting mixtures of multivariate normal distributions when the data are missing at random with an arbitrary missing-data pattern, that is, missing values may occur anywhere in the data. We develop a novel EM algorithm that substantially reduces computation time and can be exploited in many applications, such as density estimation, supervised clustering, and prediction of missing values. For multiple imputation of missing data, we also offer a data augmentation scheme using the Gibbs sampler. The proposed methodologies are illustrated on several real data sets with varying proportions of missing values.
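To make the role of an indicator matrix concrete, the following is a minimal sketch in Python with NumPy, not the authors' algorithm. It assumes missing entries are coded as NaN and that the indicator matrix for a row consists of the rows of the identity matrix corresponding to its observed coordinates. The function names (`selection_matrix`, `observed_log_density`, `responsibilities`) are hypothetical and chosen only for illustration. The sketch evaluates the normal density on the observed coordinates of a row and uses it to compute posterior component probabilities, as an E-step of a generic EM algorithm for normal mixtures would.

```python
import numpy as np

def selection_matrix(row):
    """Identity-matrix rows for the observed (non-NaN) coordinates of `row`.

    Assumption for this sketch: missing values are coded as NaN.
    """
    observed = ~np.isnan(row)
    return np.eye(row.size)[observed]

def observed_log_density(row, mean, cov):
    """Log of N(x_obs | O mean, O cov O^T), where O selects observed coordinates."""
    O = selection_matrix(row)
    x_obs = O @ np.nan_to_num(row)      # NaNs become 0 and are then dropped by O
    mu_obs = O @ mean
    cov_obs = O @ cov @ O.T
    diff = x_obs - mu_obs
    _, logdet = np.linalg.slogdet(cov_obs)
    quad = diff @ np.linalg.solve(cov_obs, diff)
    return -0.5 * (x_obs.size * np.log(2.0 * np.pi) + logdet + quad)

def responsibilities(row, weights, means, covs):
    """Posterior component probabilities for one partially observed row."""
    log_terms = np.array([
        np.log(w) + observed_log_density(row, m, c)
        for w, m, c in zip(weights, means, covs)
    ])
    log_terms -= log_terms.max()        # stabilise before exponentiating
    post = np.exp(log_terms)
    return post / post.sum()

# Usage: a bivariate row whose second coordinate is missing.
row = np.array([1.0, np.nan])
weights = [0.5, 0.5]
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
covs = [np.array([[1.0, 0.5], [0.5, 2.0]])] * 2
post = responsibilities(row, weights, means, covs)
```

Because the selection matrix depends only on the missingness pattern, rows sharing a pattern could reuse the same matrix; whether and how the paper exploits this is not stated in the abstract.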