What size net gives valid generalization? Neural Computation.
COLT '91 Proceedings of the fourth annual workshop on Computational learning theory.
Rigorous learning curve bounds from statistical mechanics. COLT '94 Proceedings of the seventh annual conference on Computational learning theory.
Statistical theory of learning curves under entropic loss criterion. Neural Computation.
An experimental and theoretical comparison of model selection methods. COLT '95 Proceedings of the eighth annual conference on Computational learning theory.
Learning Curves, Model Selection and Complexity of Neural Networks. Advances in Neural Information Processing Systems 5 (NIPS Conference).
Asymptotic properties of the Fisher kernel. Neural Computation.
The universal asymptotic scaling laws proposed by Amari et al. are studied in large-scale simulations on a CM-5. Small stochastic multilayer feedforward networks trained with backpropagation are investigated. For large numbers of training patterns t, the asymptotic generalization error scales as 1/t, as predicted. For a medium range of t, a faster 1/t² scaling is observed; this effect is explained by higher-order corrections of the likelihood expansion. For small t, the scaling law changes drastically when the network undergoes a transition from strong overfitting to effective learning.
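As a rough illustration of how such scaling exponents can be checked empirically, the sketch below fits a power law to a learning curve by regressing log(error) on log(t); a slope near -1 corresponds to the predicted 1/t regime, and a slope near -2 to the observed medium-range 1/t² regime. The error values here are synthetic stand-ins, not data from the paper, and the 1/t form and its coefficient are assumptions made only for the demonstration.

```python
import numpy as np

# Synthetic learning curve: error ~ c/t plus small measurement noise.
# Both the constant c=5.0 and the noise level are arbitrary choices.
t = np.array([100, 200, 400, 800, 1600, 3200])  # training-set sizes
rng = np.random.default_rng(0)
err = 5.0 / t + rng.normal(0.0, 1e-4, t.size)

# Fit log(err) = slope * log(t) + intercept; slope estimates the exponent.
slope, intercept = np.polyfit(np.log(t), np.log(err), 1)
print(f"estimated scaling exponent: {slope:.2f}")  # close to -1 for 1/t
```

In practice one would fit separate slopes over the small-, medium-, and large-t ranges, since the abstract reports different exponents in each regime.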