Our main contribution is a novel model selection methodology, expectation minimization of description length (EMDL), based on the minimum description length (MDL) principle. EMDL directly addresses the combinatorial scalability issue in model selection for mixture models whose components may be of different types. The goal in such problems is to optimize both the types of the components and the number of components. The key idea in EMDL is to iterate between computing the posterior of the latent variables and minimizing the expected description length of both the observed data and the latent variables. This enables EMDL to compute the optimal model in time linear in both the number of components and the number of available component types, even though the number of model candidates grows exponentially in these quantities. We prove that EMDL is compliant with the MDL principle and inherits its statistical benefits.
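To make the iteration concrete, the following is a minimal, hypothetical sketch of an EMDL-style loop on univariate data with two candidate component types (Gaussian and Laplace). All names (`emdl_fit`, `TYPES`, the quantile initialization, the use of a weighted mean as the Laplace location, and the simple (d/2) log n parameter code standing in for the expected-description-length penalty) are our own illustrative choices, not the paper's actual formulation. The point it illustrates is structural: each component's type is chosen by an independent per-component scan, so the cost is linear in the number of types rather than exponential in the number of components.

```python
import numpy as np

# Candidate component types.  Each entry is
# (name, log-density, weighted parameter fit, number of free parameters).
def gauss_logpdf(x, th):
    mu, var = th
    return -0.5 * np.log(2 * np.pi * var) - (x - mu) ** 2 / (2 * var)

def gauss_fit(x, w):
    mu = np.average(x, weights=w)
    return mu, np.average((x - mu) ** 2, weights=w) + 1e-9

def laplace_logpdf(x, th):
    mu, b = th
    return -np.log(2 * b) - np.abs(x - mu) / b

def laplace_fit(x, w):
    mu = np.average(x, weights=w)  # weighted mean as a simple location fit
    return mu, np.average(np.abs(x - mu), weights=w) + 1e-9

TYPES = [("gauss", gauss_logpdf, gauss_fit, 2),
         ("laplace", laplace_logpdf, laplace_fit, 2)]

def emdl_fit(x, K, n_iter=30):
    n = len(x)
    pi = np.full(K, 1.0 / K)
    # Crude initialization: Gaussian components centred at data quantiles.
    comps = [("gauss", gauss_logpdf,
              (np.quantile(x, (k + 1) / (K + 1)), np.var(x)))
             for k in range(K)]
    for _ in range(n_iter):
        # E-step: posterior of the latent component assignments.
        logp = np.stack([np.log(pi[k]) + comps[k][1](x, comps[k][2])
                         for k in range(K)], axis=1)
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)
        pi = resp.mean(axis=0)
        # M-step: for each component, scan candidate types and keep the
        # one minimizing a surrogate expected description length --
        # expected negative log-likelihood plus a (d/2) log n code for
        # the parameters.  The per-component scan keeps the cost linear
        # in the number of types, not exponential in K.
        new_comps = []
        for k in range(K):
            w = resp[:, k]
            best, best_dl = None, np.inf
            for name, logpdf, fit, d in TYPES:
                th = fit(x, w)
                dl = -(w * logpdf(x, th)).sum() + 0.5 * d * np.log(n)
                if dl < best_dl:
                    best, best_dl = (name, logpdf, th), dl
            new_comps.append(best)
        comps = new_comps
    return pi, [(c[0], c[2]) for c in comps]

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-4.0, 1.0, 300), rng.laplace(4.0, 0.7, 300)])
pi, comps = emdl_fit(x, K=2)
print(pi, [(name, round(th[0], 2)) for name, th in comps])
```

On this toy data the loop recovers two well-separated components with locations near -4 and 4, selecting each component's type by the per-component description-length comparison rather than by enumerating all type combinations.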