Statistical theory of learning curves under entropic loss criterion

  • Authors:
  • Shun-Ichi Amari; Noboru Murata

  • Venue:
  • Neural Computation
  • Year:
  • 1993

Abstract

The present paper elucidates a universal property of learning curves, showing how the generalization error, the training error, and the complexity of the underlying stochastic machine are related, and how the behavior of a stochastic machine improves as the number of training examples increases. The error is measured by the entropic loss. It is proved that the generalization error converges to H_0, the entropy of the conditional distribution of the true machine, as H_0 + m*/(2t), while the training error converges as H_0 - m*/(2t), where t is the number of examples and m* measures the complexity of the network. When the model is faithful, meaning that the true machine is contained in the model, m* reduces to m, the number of modifiable parameters. This is a universal law because it holds for any regular machine, irrespective of its structure, under the maximum likelihood estimator. Similar relations are obtained for the Bayes and Gibbs learning algorithms. These learning curves show the relation among the accuracy of learning, the complexity of a model, and the number of training examples.
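For reference, the asymptotic relations stated in the abstract can be written compactly as below. This is a sketch transcribing the abstract's claims only; the symbols E_gen and E_train for the generalization and training errors, and the angle brackets for the average over training sets, are labeling choices not given in the abstract.

```latex
% Asymptotic learning curves under the entropic loss, as stated in the abstract.
% H_0 : entropy of the conditional distribution of the true machine
% m^* : complexity of the network (m^* = m, the number of modifiable
%       parameters, when the model is faithful)
% t   : number of training examples
\begin{align}
  \langle E_{\mathrm{gen}}(t) \rangle   &\simeq H_0 + \frac{m^*}{2t}, \\
  \langle E_{\mathrm{train}}(t) \rangle &\simeq H_0 - \frac{m^*}{2t}.
\end{align}
```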