In regular statistical models, leave-one-out cross-validation is asymptotically equivalent to the Akaike information criterion. However, since many learning machines are singular statistical models, the asymptotic behavior of cross-validation in such models has remained unknown. In previous studies, we established singular learning theory and proposed a widely applicable information criterion, whose expectation value is asymptotically equal to the average Bayes generalization loss. In the present paper, we theoretically compare the Bayes cross-validation loss with the widely applicable information criterion and prove two theorems. First, the Bayes cross-validation loss is asymptotically equivalent to the widely applicable information criterion as a random variable; therefore, model selection and hyperparameter optimization using these two values are asymptotically equivalent. Second, the sum of the Bayes generalization error and the Bayes cross-validation error is asymptotically equal to 2λ/n, where λ is the real log canonical threshold and n is the number of training samples. Therefore, the relation between the cross-validation error and the generalization error is determined by the algebraic geometrical structure of the learning machine. We also clarify that the deviance information criteria are different from both the Bayes cross-validation loss and the widely applicable information criterion.
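The two theorems can be sketched in symbols. The notation below is an assumption based on the abstract's own quantities: C_n for the Bayes leave-one-out cross-validation loss, W_n for the widely applicable information criterion, B_g and B_cv for the Bayes generalization and cross-validation errors, λ for the real log canonical threshold, and n for the sample size.

```latex
% Theorem 1 (equivalence as random variables, not merely in expectation):
C_n = W_n + o_p\!\left(\tfrac{1}{n}\right)

% Theorem 2 (the errors are coupled through the algebraic geometry of the
% model via the real log canonical threshold \lambda):
B_g + B_{cv} = \frac{2\lambda}{n} + o_p\!\left(\tfrac{1}{n}\right)
```

Read this way, Theorem 2 says the two errors move in opposite directions around the constant 2λ/n: a model whose cross-validation error happens to be small on a given sample tends to have a correspondingly larger generalization error, with the sum fixed to leading order.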