Many techniques for model selection in the field of neural networks correspond to well-established statistical methods. For example, architecture modifications based on test variables calculated after convergence of the training process can be viewed as part of a hypothesis-testing search, and the use of complexity penalty terms is essentially a form of regularization or biased regression. The method of "stopped" or "cross-validation" training, on the other hand, in which an oversized network is trained only until the error on a separate validation set of examples begins to deteriorate, is a genuine innovation, since model selection does not require convergence of the training process. Here, the training process is used to perform a directed search of the parameter space for a model that does not overfit the data and therefore generalizes better. In this paper we show that this performance can be significantly enhanced by extending the "nonconvergent method" of stopped training to include dynamic topology modifications (dynamic weight pruning) and modified complexity penalty term methods in which the weighting of the penalty term is adjusted during the training process. In an extensive series of simulation experiments, we demonstrate the general superiority of these "extended" nonconvergent methods over classical penalty term methods, simple stopped training, and methods that only vary the number of hidden units.
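To make the combination described above concrete, the sketch below illustrates one possible reading of it in Python: an oversized one-hidden-layer network trained with stopped training against a validation set, a weight-decay penalty whose weighting is ramped up during training, and occasional pruning of the smallest weights while training continues. The network size, learning rate, schedules, patience, and toy data are illustrative assumptions, not values or procedures taken from the paper's experiments.

```python
# Minimal sketch (not the authors' original code): stopped training combined
# with a dynamically weighted penalty term and occasional weight pruning.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data, split into training and validation sets (illustrative).
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X) + 0.1 * rng.standard_normal(X.shape)
X_tr, y_tr, X_va, y_va = X[:150], y[:150], X[150:], y[150:]

# Oversized one-hidden-layer network (tanh hidden units, linear output).
n_hidden = 30
W1 = 0.5 * rng.standard_normal((1, n_hidden))
b1 = np.zeros(n_hidden)
W2 = 0.5 * rng.standard_normal((n_hidden, 1))
b2 = np.zeros(1)

def forward(X):
    """Return hidden activations and network output for input X."""
    h = np.tanh(X @ W1 + b1)
    return h, h @ W2 + b2

lr = 0.05              # learning rate (assumed)
lam = 0.0              # weight-decay strength, adjusted during training
best_va, best_params = np.inf, None
patience, wait = 50, 0

for epoch in range(2000):
    # Gradient step on squared error plus the current weight-decay penalty.
    h, out = forward(X_tr)
    err = out - y_tr
    dW2 = h.T @ err / len(X_tr) + lam * W2
    db2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1.0 - h ** 2)
    dW1 = X_tr.T @ dh / len(X_tr) + lam * W1
    db1 = dh.mean(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    # Ingredient 1: adjust the penalty weighting during training
    # (a simple linear ramp here; the paper studies more refined schedules).
    lam = min(1e-3, lam + 1e-6)

    # Ingredient 2: dynamic weight pruning, zeroing the smallest input-layer
    # weights every so often while training continues (illustrative rule).
    if epoch and epoch % 200 == 0:
        thresh = np.quantile(np.abs(W1), 0.05)
        W1[np.abs(W1) < thresh] = 0.0

    # Ingredient 3: stopped training, i.e. monitor validation error and keep
    # the best parameters seen so far instead of training to convergence.
    _, va_out = forward(X_va)
    va_err = float(np.mean((va_out - y_va) ** 2))
    if va_err < best_va:
        best_va, wait = va_err, 0
        best_params = (W1.copy(), b1.copy(), W2.copy(), b2.copy())
    else:
        wait += 1
        if wait >= patience:       # stop before the training error converges
            break

W1, b1, W2, b2 = best_params
print(f"stopped at epoch {epoch}, best validation MSE = {best_va:.4f}")
```

The essential point of the sketch is that model selection happens inside the training loop: the validation error decides when to stop and which parameters to keep, while the penalty weighting and the pruned topology change as training proceeds rather than being fixed in advance.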