A statistical theory of overtraining is proposed. The analysis treats general realizable stochastic neural networks trained with the Kullback-Leibler divergence in the asymptotic regime of a large number of training examples. It is shown that the asymptotic gain in generalization error from early stopping is small, even when the optimal stopping time is known. For cross-validated early stopping, we derive the ratio in which the examples should be divided between the training and cross-validation sets to obtain optimal performance. Although cross-validated early stopping is useless in the asymptotic region, it does decrease the generalization error in the nonasymptotic region. Our large-scale simulations, run on a CM-5, are in good agreement with the analytical findings.
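To make the procedure under discussion concrete, here is a minimal sketch of cross-validated early stopping: the data are split between a training set and a cross-validation set, the model is trained on the former, and training stops once error on the latter stops improving. This is an illustrative toy (a one-parameter linear model fit by gradient descent), not the paper's experimental setup; the split fraction, learning rate, and patience values are arbitrary choices for the example.

```python
import random

def early_stopping_fit(xs, ys, val_fraction=0.3, lr=0.05,
                       max_epochs=500, patience=10):
    """Fit y ~ w * x by full-batch gradient descent on a training split,
    stopping when cross-validation error has not improved for `patience`
    consecutive epochs. Returns the weight with the best validation error."""
    n_val = max(1, int(len(xs) * val_fraction))
    x_va, y_va = xs[:n_val], ys[:n_val]          # cross-validation set
    x_tr, y_tr = xs[n_val:], ys[n_val:]          # training set

    def mse(w, x, y):
        return sum((w * xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x)

    w = 0.0
    best_w, best_err, bad_epochs = w, mse(w, x_va, y_va), 0
    for _ in range(max_epochs):
        # Gradient of training MSE with respect to w.
        grad = sum(2 * (w * xi - yi) * xi
                   for xi, yi in zip(x_tr, y_tr)) / len(x_tr)
        w -= lr * grad
        err = mse(w, x_va, y_va)
        if err < best_err:
            best_w, best_err, bad_epochs = w, err, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:           # early stop
                break
    return best_w

# Synthetic data from a noisy linear teacher with true slope 2.0.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(60)]
ys = [2.0 * x + random.gauss(0, 0.1) for x in xs]
w_hat = early_stopping_fit(xs, ys)
print(w_hat)  # an estimate close to the true slope 2.0
```

The key design point, matching the abstract, is that `val_fraction` controls how the examples are divided between training and cross-validation: a larger validation set stops training more reliably but leaves fewer examples to learn from, which is exactly the trade-off whose optimum the paper analyzes.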