A statistical theory of overtraining is proposed. The analysis treats general realizable stochastic neural networks trained with the Kullback-Leibler divergence in the asymptotic regime of a large number of training examples. It is shown that the asymptotic gain in generalization error from early stopping is small, even when the optimal stopping time is known. For cross-validated early stopping, we derive the ratio in which the examples should be divided between the training and cross-validation sets to obtain optimal performance. Although cross-validated early stopping is useless in the asymptotic region, it does decrease the generalization error in the nonasymptotic region. Our large-scale simulations, run on a CM-5, are in good agreement with the analytical findings.
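To make the procedure under discussion concrete, here is a minimal sketch of cross-validated early stopping: the data are split between a training set and a cross-validation set, the model is trained on the former, and training stops once error on the latter stops improving. This is an illustrative toy (a one-parameter linear model fit by gradient descent), not the paper's experimental setup; the split fraction, learning rate, and patience values are arbitrary choices for the example.

```python
import random

def early_stopping_fit(xs, ys, val_fraction=0.3, lr=0.05,
                       max_epochs=500, patience=10):
    """Fit y ~ w * x by full-batch gradient descent on a training split,
    stopping when cross-validation error has not improved for `patience`
    consecutive epochs. Returns the weight with the best validation error."""
    n_val = max(1, int(len(xs) * val_fraction))
    x_va, y_va = xs[:n_val], ys[:n_val]          # cross-validation set
    x_tr, y_tr = xs[n_val:], ys[n_val:]          # training set

    def mse(w, x, y):
        return sum((w * xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x)

    w = 0.0
    best_w, best_err, bad_epochs = w, mse(w, x_va, y_va), 0
    for _ in range(max_epochs):
        # Gradient of training MSE with respect to w.
        grad = sum(2 * (w * xi - yi) * xi
                   for xi, yi in zip(x_tr, y_tr)) / len(x_tr)
        w -= lr * grad
        err = mse(w, x_va, y_va)
        if err < best_err:
            best_w, best_err, bad_epochs = w, err, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:           # early stop
                break
    return best_w

# Synthetic data from a noisy linear teacher with true slope 2.0.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(60)]
ys = [2.0 * x + random.gauss(0, 0.1) for x in xs]
w_hat = early_stopping_fit(xs, ys)
print(w_hat)  # an estimate close to the true slope 2.0
```

The key design point, matching the abstract, is that `val_fraction` controls how the examples are divided between training and cross-validation: a larger validation set stops training more reliably but leaves fewer examples to learn from, which is exactly the trade-off whose optimum the paper analyzes.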