Estimating the number of components in a finite mixture model: the special case of homogeneity

  • Authors:
  • Peter Schlattmann

  • Affiliations:
  • Department of Psychiatry and Psychotherapy, Freie Universität Berlin, Clinical Psychophysiology, Eschenallee 3, D-14050 Berlin, Germany

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2003

Abstract

Finite mixture models arise naturally as models of unobserved population heterogeneity. An application in disease mapping shows that mixture models are useful in separating signal from noise. The number of components k of the mixture model therefore needs to be estimated, where k = 1 is the important homogeneous case. Because of the irregularity of the parameter space, the log-likelihood-ratio statistic (LRS) does not have a χ² limit distribution, and it is therefore difficult to use the LRS to test for the number of components. An alternative approach uses the nonparametric bootstrap: a mixture algorithm is applied B times to bootstrap samples drawn with replacement from the original sample. The number of components k is then estimated as the mode of the bootstrap distribution of k. On empirical grounds, this approach provides a mode-unbiased and consistent estimator for the number of components in the homogeneous Poisson case. The distribution of the LRS for the testing problem H0 : k = 1 vs. H1 : k > 1 is addressed for the Poisson case. For a very large sample size of n = 10000, this distribution approximates a χ² distribution with one degree of freedom.
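
The bootstrap procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the abstract does not say which mixture algorithm is applied to each bootstrap sample, so this sketch substitutes a plain EM fit of k-component Poisson mixtures and picks k by BIC. The function names (mixture_loglik, select_k, bootstrap_k) and the parameters k_max, iters and seed are hypothetical.

```python
import math
import random
from collections import Counter


def poisson_logpmf(x, lam):
    """Log of the Poisson probability mass function at count x."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def mixture_loglik(data, k, iters=200):
    """Fit a k-component Poisson mixture by EM; return the log-likelihood."""
    freq = Counter(data)  # distinct counts with their frequencies
    n = len(data)
    mean = sum(data) / n
    # Deterministic start: rates spread around the sample mean, equal weights.
    lams = [max(mean * 2 * (j + 1) / (k + 1), 1e-3) for j in range(k)]
    pis = [1.0 / k] * k
    loglik = 0.0
    for _ in range(iters + 1):
        num, den, loglik = [0.0] * k, [0.0] * k, 0.0
        for x, w in freq.items():
            logs = [math.log(p) + poisson_logpmf(x, l)
                    for p, l in zip(pis, lams)]
            m = max(logs)
            probs = [math.exp(v - m) for v in logs]
            s = sum(probs)
            loglik += w * (m + math.log(s))  # log-likelihood at current fit
            for j in range(k):
                r = w * probs[j] / s  # E-step: weighted responsibility
                den[j] += r
                num[j] += r * x
        # M-step, clamped away from zero so the logarithms stay finite.
        pis = [max(d / n, 1e-12) for d in den]
        lams = [max(nu / max(d, 1e-12), 1e-8) for nu, d in zip(num, den)]
    return loglik


def select_k(data, k_max=3):
    """Assumption: choose k by BIC (a stand-in for the paper's algorithm)."""
    n = len(data)
    bic = {k: -2 * mixture_loglik(data, k) + (2 * k - 1) * math.log(n)
           for k in range(1, k_max + 1)}
    return min(bic, key=bic.get)


def bootstrap_k(data, B=50, k_max=3, seed=0):
    """Return the mode of the bootstrap distribution of k, plus the tally."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement B times and re-estimate k on each sample.
    tally = Counter(select_k(rng.choices(data, k=n), k_max) for _ in range(B))
    return tally.most_common(1)[0][0], tally


def poisson_draw(rng, lam):
    """Knuth's multiplication method for one Poisson(lam) variate."""
    limit, count, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


if __name__ == "__main__":
    rng = random.Random(42)
    sample = [poisson_draw(rng, 3.0) for _ in range(200)]  # homogeneous data
    k_mode, tally = bootstrap_k(sample, B=50, k_max=3, seed=1)
    print("bootstrap distribution of k:", dict(tally))
    print("estimated number of components:", k_mode)
```

The demo draws a homogeneous Poisson sample, so the tally would be expected to concentrate on k = 1, but the exact counts depend on the seed and on the BIC-based selection rule assumed here.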