Keeping the neural networks simple by minimizing the description length of the weights
COLT '93 Proceedings of the Sixth Annual Conference on Computational Learning Theory
Bayesian Learning for Neural Networks
Generalization Performance of Subspace Bayes Approach in Linear Neural Networks
IEICE Transactions on Information and Systems
Inferring parameters and structure of latent variable models by variational Bayes
UAI '99 Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence
Generalization error of automatic relevance determination
ICANN '07 Proceedings of the 17th International Conference on Artificial Neural Networks
In singular models, Bayes estimation typically generalizes better than maximum likelihood estimation; however, accurately approximating it with Markov chain Monte Carlo methods incurs huge computational costs. The variational Bayes (VB) approach, a tractable alternative, has recently shown good performance in the automatic relevance determination (ARD) model, a kind of hierarchical Bayesian learning, when applied to brain current estimation from magnetoencephalography (MEG) data, an ill-posed linear inverse problem. On the other hand, it has been proved that, in three-layer linear neural networks (LNNs), the VB approach is asymptotically equivalent to a positive-part James-Stein type shrinkage estimation. In this paper, noting the similarity between the ARD in a linear problem and an LNN, we analyze a simplified version of the VB approach in the ARD. We discuss its relation to shrinkage estimation and how ill-posedness affects learning. We also propose an algorithm that requires simpler computation than, and should provide performance similar to, the VB approach.
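The abstract states that VB learning in linear neural networks is asymptotically equivalent to a positive-part James-Stein type shrinkage estimation. As background for readers unfamiliar with that estimator, here is a minimal sketch of the classical positive-part James-Stein shrinkage of an observation vector toward the origin; it is an illustration of the generic estimator only, not the paper's exact VB-equivalent form, and the function name and Gaussian-noise assumption (x ~ N(theta, sigma2*I), dimension >= 3) are ours.

```python
import numpy as np

def positive_part_james_stein(x, sigma2=1.0):
    """Positive-part James-Stein shrinkage of x toward the origin.

    Assumes x is a single draw from N(theta, sigma2 * I) with dim(x) >= 3.
    The shrinkage factor (1 - (d - 2) * sigma2 / ||x||^2) is clipped at
    zero, which is what makes this the *positive-part* variant: small
    observations are shrunk exactly to the zero vector.
    """
    d = x.shape[0]
    norm2 = float(np.dot(x, x))
    factor = max(0.0, 1.0 - (d - 2) * sigma2 / norm2)
    return factor * x
```

Observations with squared norm below (d - 2) * sigma2 are mapped to zero, which is the thresholding behavior the paper connects to model selection in singular models.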