Linear Time Model Selection for Mixture of Heterogeneous Components

  • Authors:
  • Ryohei Fujimaki;Satoshi Morinaga;Michinari Momma;Kenji Aoki;Takayuki Nakata

  • Affiliations:
  • NEC Common Platform Software Research Laboratories (all authors)

  • Venue:
  • ACML '09 Proceedings of the 1st Asian Conference on Machine Learning: Advances in Machine Learning
  • Year:
  • 2009

Abstract

Our main contribution is a novel model selection methodology, expectation minimization of description length (EMDL), based on the minimum description length (MDL) principle. EMDL addresses the combinatorial scalability problem in model selection for mixture models whose components may be of different types. In such problems, the goal is to optimize the type of each component as well as the number of components. One key idea in EMDL is to alternate between computing the posterior of the latent variables and minimizing the expected description length of both the observed data and the latent variables. This enables EMDL to compute the optimal model in time linear in both the number of components and the number of available component types, even though the number of candidate models grows exponentially with these numbers. We prove that EMDL is compliant with the MDL principle and enjoys its statistical benefits.
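
The alternation described in the abstract can be illustrated with a small sketch. This is not the authors' EMDL algorithm, whose exact description-length criterion is not given here. It is a minimal illustration that assumes one-dimensional data, a fixed number of components, two hypothetical candidate types (Gaussian and Laplace), and a BIC-style parameter penalty standing in for the description length. The names `select_types` and `TYPES` are invented for this example. The point it shows is that, once the posteriors are fixed, each component can pick its type independently of the others, so one sweep costs (components × types) fits.

```python
import math
import random

def fit_gaussian(xs, ws):
    # Weighted maximum-likelihood mean and standard deviation.
    total = max(sum(ws), 1e-12)
    mu = sum(w * x for x, w in zip(xs, ws)) / total
    var = sum(w * (x - mu) ** 2 for x, w in zip(xs, ws)) / total
    return (mu, max(math.sqrt(var), 1e-3))

def logpdf_gaussian(x, params):
    mu, sigma = params
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2

def fit_laplace(xs, ws):
    # Weighted median as location, weighted mean absolute deviation as scale.
    total = max(sum(ws), 1e-12)
    acc, loc = 0.0, xs[-1]
    for x, w in sorted(zip(xs, ws)):
        acc += w
        if acc >= total / 2:
            loc = x
            break
    scale = sum(w * abs(x - loc) for x, w in zip(xs, ws)) / total
    return (loc, max(scale, 1e-3))

def logpdf_laplace(x, params):
    loc, scale = params
    return -math.log(2 * scale) - abs(x - loc) / scale

# Hypothetical candidate types: (fit function, log-density, parameter count).
TYPES = {
    "gaussian": (fit_gaussian, logpdf_gaussian, 2),
    "laplace": (fit_laplace, logpdf_laplace, 2),
}

def select_types(xs, n_components, n_iters=30, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    # Random initial posteriors (responsibilities); each row sums to 1.
    resp = [[rng.random() + 1e-6 for _ in range(n_components)] for _ in xs]
    resp = [[r / sum(row) for r in row] for row in resp]
    fits = 0
    model = []
    for _ in range(n_iters):
        # Minimization step: each component picks its cheapest type on its own.
        model = []
        for k in range(n_components):
            ws = [row[k] for row in resp]
            nk = sum(ws)
            best = None
            for name, (fit, logpdf, dim) in TYPES.items():
                params = fit(xs, ws)
                fits += 1
                # Assumed cost: expected code length of the data under this
                # type plus a BIC-style charge for its parameters.
                cost = -sum(w * logpdf(x, params) for x, w in zip(xs, ws))
                cost += 0.5 * dim * math.log(max(nk, 1.0))
                if best is None or cost < best[0]:
                    best = (cost, name, params, max(nk / n, 1e-12))
            model.append(best[1:])
        # Posterior step: recompute responsibilities under the chosen model.
        new_resp = []
        for x in xs:
            logs = [math.log(pi) + TYPES[name][1](x, params) for name, params, pi in model]
            top = max(logs)
            ps = [math.exp(v - top) for v in logs]
            s = sum(ps)
            new_resp.append([p / s for p in ps])
        resp = new_resp
    return model, fits

if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(-5.0, 1.0) for _ in range(200)]
    data += [5.0 + rng.choice([-1, 1]) * rng.expovariate(1.0) for _ in range(200)]
    model, fits = select_types(data, n_components=2)
    for name, params, weight in model:
        print(name, params, weight)
    print("fits performed:", fits)
```

With K components and T candidate types, each sweep of this sketch performs K × T weighted fits, whereas enumerating every labelled assignment of types to components would require T^K candidate models. Choosing the number of components, which the abstract also covers, is left out of the sketch.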