Maximum Likelihood Estimation of Mixture Densities for Binned and Truncated Multivariate Data

  • Authors:
  • Igor V. Cadez; Padhraic Smyth; Geoff J. McLachlan; Christine E. McLaren

  • Affiliations:
  • Department of Information and Computer Science, University of California, Irvine, CA 92697, USA. icadez@ics.uci.edu; Department of Information and Computer Science, University of California, Irvine, CA 92697, USA. smyth@ics.uci.edu; Department of Mathematics, The University of Queensland, Brisbane, Australia; Division of Epidemiology, Department of Medicine, University of California, Irvine, CA 92697, USA

  • Venue:
  • Machine Learning - Special issue: Unsupervised learning
  • Year:
  • 2002

Abstract

Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44(2), 571–578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure, so a naive implementation can be computationally expensive. To reduce the computational cost, a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that, with a sufficient number of bins and data points, it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to the diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.
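
To make the role of the per-bin integrals concrete, the following is a minimal sketch of the univariate case only, where each integral reduces to a difference of normal CDFs. It is not the paper's multivariate procedure or its numerical speed-ups. The function names, the Gaussian component choice, and the multinomial form of the truncated log-likelihood (bin probabilities renormalized by the total mass of the observed bins) are assumptions made for illustration.

```python
import math


def normal_cdf(x, mean, std):
    """CDF of a univariate normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def bin_masses(edges, weights, means, stds):
    """masses[k][j] = weight_k * (mass of component k inside bin j).

    `edges` holds the sorted edges of the observed bins; anything outside
    [edges[0], edges[-1]] is treated as truncated (unobserved).
    """
    masses = []
    for w, m, s in zip(weights, means, stds):
        cdf = [normal_cdf(e, m, s) for e in edges]
        masses.append([w * (hi - lo) for lo, hi in zip(cdf[:-1], cdf[1:])])
    return masses


def binned_truncated_loglik(counts, edges, weights, means, stds):
    """Log-likelihood of bin counts, up to an additive constant.

    Each bin probability is divided by the total mixture mass over the
    observed bins, which accounts for truncation.
    """
    masses = bin_masses(edges, weights, means, stds)
    bin_prob = [sum(col) for col in zip(*masses)]
    observed = sum(bin_prob)
    return sum(n * math.log(p / observed)
               for n, p in zip(counts, bin_prob) if n > 0)


def responsibilities(edges, weights, means, stds):
    """resp[j][k] = posterior probability of component k given bin j."""
    masses = bin_masses(edges, weights, means, stds)
    resp = []
    for col in zip(*masses):
        total = sum(col)
        resp.append([m / total for m in col])
    return resp


if __name__ == "__main__":
    edges = [-2.0, 0.0, 2.0]      # two observed bins; tails are truncated
    counts = [40, 60]             # hypothetical bin counts
    params = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
    print(binned_truncated_loglik(counts, edges, *params))
    print(responsibilities(edges, *params))
```

In more than one dimension the CDF differences become integrals of each component density over a multidimensional bin, which generally have no such closed form and must be evaluated numerically at every EM iteration. That is the cost the abstract refers to.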