The present paper elucidates a universal property of learning curves, which shows how the generalization error, the training error, and the complexity of the underlying stochastic machine are related, and how the behavior of a stochastic machine improves as the number of training examples increases. The error is measured by the entropic loss. It is proved that the generalization error converges to H_0, the entropy of the conditional distribution of the true machine, as H_0 + m*/(2t), while the training error converges as H_0 - m*/(2t), where t is the number of examples and m* represents the complexity of the network. When the model is faithful, meaning that the true machine is contained in the model, m* reduces to m, the number of modifiable parameters. This is a universal law because it holds for any regular machine under the maximum likelihood estimator, irrespective of its structure. Similar relations are obtained for the Bayes and Gibbs learning algorithms. These learning curves show the relation among the accuracy of learning, the complexity of a model, and the number of training examples.
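As a rough illustration of these asymptotic relations, the sketch below computes the predicted generalization error, training error, and their gap m*/t for the maximum likelihood case. It is a minimal sketch only; the values of H0, m_star, and the sample sizes are hypothetical and not taken from the paper.

```python
# Asymptotic learning-curve relations under entropic loss (maximum likelihood case):
#   E[generalization error] ~ H0 + m_star / (2 t)
#   E[training error]       ~ H0 - m_star / (2 t)
# Their gap is m_star / t, which shrinks as the number of examples t grows.

def asymptotic_errors(H0: float, m_star: float, t: int) -> tuple[float, float]:
    """Return (generalization error, training error) predicted for t examples."""
    gen = H0 + m_star / (2 * t)
    train = H0 - m_star / (2 * t)
    return gen, train

if __name__ == "__main__":
    H0 = 0.5       # entropy of the true conditional distribution (hypothetical value)
    m_star = 100   # complexity term; equals m, the number of parameters, for a faithful model
    for t in (100, 1000, 10000):
        gen, train = asymptotic_errors(H0, m_star, t)
        print(f"t={t:6d}  gen={gen:.4f}  train={train:.4f}  gap={gen - train:.4f}")
```

Note that both curves approach H_0 from opposite sides, so the observable gap between training and generalization error provides a handle on the complexity term m*.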