Mutual Information Theory for Adaptive Mixture Models

  • Authors:
  • Zheng Rong Yang; Mark Zwolinski

  • Affiliations:
  • Univ. of Exeter, Exeter, U.K.; Univ. of Southampton, Southampton, U.K.

  • Venue:
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year:
  • 2001

Abstract

Many pattern recognition systems need to estimate an underlying probability density function (pdf). Mixture models are commonly used for this purpose, in which an underlying pdf is estimated by a finite mixture of distributions. The basic computational element of a density mixture model is a component with a nonlinear mapping function that takes part in the mixing. Selecting an optimal set of components is important for an efficient and accurate estimate of the underlying pdf. Previous work has commonly estimated an underlying pdf based on the information contained in the patterns. In this paper, mutual information theory is employed to measure whether two components are statistically dependent. If a component has small mutual information with the other components, it is largely statistically independent of them; hence, it makes a distinct contribution to the system pdf and should not be removed. However, if a component has large mutual information, it is unlikely to be statistically independent of the other components and may be removed without significant damage to the estimated pdf. Repeatedly removing components with large, positive mutual information yields a density mixture model with an optimal structure whose estimate is very close to the true pdf.
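
The abstract describes an iterative pruning procedure: estimate how statistically dependent each component is on the rest of the mixture and remove the most redundant one until all remaining components are sufficiently independent. The following Python sketch illustrates that idea under loose assumptions; it is not the paper's estimator. It uses a Gaussian mixture, a histogram-based pairwise mutual-information estimate over component responsibilities, and simply refits with one fewer component after each removal. The function names, the binning, and the threshold are all illustrative choices.

```python
# Illustrative sketch of mutual-information-based component pruning
# (hypothetical helper names; not the authors' algorithm or estimator).
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.metrics import mutual_info_score


def binned_mi(a, b, bins=10):
    """Histogram-based mutual information between two responsibility vectors in [0, 1]."""
    edges = np.linspace(0.0, 1.0, bins)
    return mutual_info_score(np.digitize(a, edges), np.digitize(b, edges))


def prune_redundant_components(X, n_start=10, mi_threshold=0.5, bins=10, seed=0):
    """Iteratively drop a component while any pair of components shares large
    mutual information in their responsibilities, refitting after each removal."""
    k = n_start
    while k > 1:
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(X)
        resp = gmm.predict_proba(X)  # responsibilities, shape (n_samples, k)
        # Find the largest pairwise mutual information among components.
        worst_mi = -np.inf
        for i in range(k):
            for j in range(i + 1, k):
                worst_mi = max(worst_mi, binned_mi(resp[:, i], resp[:, j], bins))
        if worst_mi < mi_threshold:  # all components sufficiently independent: stop
            return gmm
        k -= 1  # a redundant component exists: refit with one fewer component
    return GaussianMixture(n_components=1, random_state=seed).fit(X)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated clusters; pruning should settle near two components.
    X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2))])
    model = prune_redundant_components(X, n_start=6)
    print("components kept:", model.n_components)
```

A pairwise maximum is used here only as a simple proxy for a component's dependence on the rest of the mixture; the paper's formulation measures mutual information between a component and the remaining mixture directly.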