Generalization Error of Linear Neural Networks in an Empirical Bayes Approach

Authors:
Shinichi Nakajima;Sumio Watanabe
Affiliations:
Tokyo Institute of Technology, Yokohama, Kanagawa, Japan and Nikon Corporation, Kumagaya, Saitama, Japan;Tokyo Institute of Technology, Yokohama, Kanagawa, Japan
Venue:
IJCAI'05: Proceedings of the 19th International Joint Conference on Artificial Intelligence
Year:
2005

Abstract

It is well known that in unidentifiable models, Bayes estimation achieves better generalization performance than maximum likelihood estimation. However, accurately approximating the posterior distribution is computationally expensive. In this paper, we consider an empirical Bayes approach in which some of the parameters are regarded as hyperparameters, which we call a subspace Bayes approach, and theoretically analyze the generalization error of three-layer linear neural networks. We show that the subspace Bayes approach is asymptotically equivalent to a positive-part James-Stein type shrinkage estimation and behaves similarly to Bayes estimation in typical cases.
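
For readers unfamiliar with the term, the following minimal sketch shows the classic positive-part James-Stein estimator for the mean of a Gaussian vector. It is only an illustration of the kind of shrinkage the abstract refers to, not the estimator derived in the paper for linear neural networks. The function name, the known noise variance sigma2, and the example inputs are assumptions made for this sketch.

```python
def positive_part_james_stein(x, sigma2=1.0):
    """Classic positive-part James-Stein estimate of a Gaussian mean vector.

    x      : list of floats, a single noisy observation of the mean (d >= 3)
    sigma2 : noise variance per coordinate, assumed known (illustrative only)
    """
    d = len(x)
    if d < 3:
        raise ValueError("James-Stein shrinkage needs dimension d >= 3")
    squared_norm = sum(v * v for v in x)
    if squared_norm == 0.0:
        return [0.0] * d
    # Shrinkage factor, clipped at zero: this clipping is the "positive part".
    factor = max(0.0, 1.0 - (d - 2) * sigma2 / squared_norm)
    return [factor * v for v in x]


# A weak observation (small norm) is shrunk all the way to the origin.
small = positive_part_james_stein([0.5, -0.5, 0.5, 0.5])
# A strong observation (large norm) is only slightly pulled toward it.
large = positive_part_james_stein([10.0, 0.0, 0.0, 0.0])
```

The clipping at zero means that components too weak to be distinguished from noise are discarded entirely, while strong components are kept almost unchanged.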