Keeping the neural networks simple by minimizing the description length of the weights
Proceedings of the Sixth Annual Conference on Computational Learning Theory (COLT '93)
Online Model Selection Based on the Variational Bayes
Neural Computation
Algebraic Analysis for Nonidentifiable Learning Machines
Neural Computation
Pattern Recognition and Machine Learning (Information Science and Statistics)
Stochastic Complexities of Gaussian Mixtures in Variational Bayesian Approximation
The Journal of Machine Learning Research
Programming collective intelligence
Algebraic Geometry and Statistical Learning Theory
Equations of states in singular statistical estimation
Neural Networks
Inferring Parameters and Structure of Latent Variable Models by Variational Bayes
Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI '99)
Variational Bayes learning, also known as mean-field approximation, is widely used in statistical models built from mixtures of exponential-family distributions, for example normal mixtures, binomial mixtures, and hidden Markov models. Deriving a variational Bayes learning algorithm requires choosing the hyperparameters of the prior distribution; however, no design method for these hyperparameters has yet been established. In this paper, we propose two design methods for the hyperparameters, each serving a different purpose. In the first method, the hyperparameter is chosen to minimize the generalization error; in the second, it is chosen so that candidates for hidden structure in the training data are extracted. Experiments show that the optimal hyperparameters for these two purposes differ from each other.
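To illustrate how the prior hyperparameter shapes variational Bayes learning, here is a minimal sketch (not the paper's algorithm) of VB for a one-dimensional Gaussian mixture with known unit variance and a symmetric Dirichlet(alpha0) prior on the mixing weights. The mean update is a simplified EM-style step, the digamma implementation is a crude stdlib-only approximation, and all names and data are illustrative. A small alpha0 drives redundant components' weights toward zero (useful for extracting hidden structure), while a large alpha0 keeps every weight bounded away from zero — the kind of purpose-dependent behavior the abstract describes.

```python
import math


def digamma(x):
    # Crude digamma: recurrence to push x above 6, then an
    # asymptotic expansion. Accurate enough for this sketch.
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return r + math.log(x) - 0.5 / x - f * (1/12 - f * (1/120 - f / 252))


def vb_mixture(data, K, alpha0, iters=100):
    """Simplified VB for a 1D unit-variance Gaussian mixture.

    Dirichlet(alpha0) prior on mixing weights; component means are
    updated by responsibility-weighted averages (an EM-style step).
    """
    lo, hi = min(data), max(data)
    means = [lo + (hi - lo) * k / (K - 1) for k in range(K)]
    alpha = [alpha0 + len(data) / K] * K  # variational Dirichlet params
    for _ in range(iters):
        s = sum(alpha)
        # E[log pi_k] under the variational Dirichlet posterior
        elogpi = [digamma(a) - digamma(s) for a in alpha]
        Nk = [0.0] * K  # expected counts
        Sk = [0.0] * K  # responsibility-weighted sums
        for x in data:
            logr = [elogpi[k] - 0.5 * (x - means[k]) ** 2 for k in range(K)]
            m = max(logr)
            r = [math.exp(l - m) for l in logr]
            z = sum(r)
            for k in range(K):
                rk = r[k] / z
                Nk[k] += rk
                Sk[k] += rk * x
        # Hyperparameter enters here: posterior Dirichlet(alpha0 + N_k)
        alpha = [alpha0 + Nk[k] for k in range(K)]
        means = [Sk[k] / Nk[k] if Nk[k] > 1e-12 else means[k]
                 for k in range(K)]
    total = sum(alpha)
    return [a / total for a in alpha], means


# Two well-separated clusters but K=3 components: with a small alpha0
# the redundant component's weight shrinks toward zero, while a large
# alpha0 keeps all three weights substantial.
data = [0.0, 0.1, -0.1, 0.05, 5.0, 5.1, 4.9, 5.05]
w_small, _ = vb_mixture(data, K=3, alpha0=0.01)
w_large, _ = vb_mixture(data, K=3, alpha0=10.0)
```

Comparing `min(w_small)` with `min(w_large)` makes the trade-off concrete: the same model and data yield very different effective structures depending on the hyperparameter, which is why a single "default" choice cannot serve both generalization and structure extraction.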