Many techniques for model selection in the field of neural networks correspond to well-established statistical methods. For example, architecture modifications based on test variables calculated after convergence of the training process can be viewed as part of a hypothesis-testing search, and the use of complexity penalty terms is essentially a form of regularization or biased regression. The method of "stopped" or "cross-validation" training, on the other hand, in which an oversized network is trained only until the error on a separate validation set of examples begins to deteriorate, is a genuine innovation, since model selection does not require convergence of the training process. Here, the training process is used to perform a directed search of the parameter space for a model that does not overfit the data and therefore generalizes better. In this paper we show that this performance can be significantly enhanced by extending the "nonconvergent method" of stopped training to include dynamic topology modifications (dynamic weight pruning) and modified complexity penalty term methods in which the weighting of the penalty term is adjusted during the training process. In an extensive series of simulation experiments, we demonstrate the general superiority of these "extended" nonconvergent methods over classical penalty term methods, simple stopped training, and methods that only vary the number of hidden units.
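To make the combination described above concrete, the sketch below illustrates one possible reading of it in Python: an oversized one-hidden-layer network trained with stopped training against a validation set, a weight-decay penalty whose weighting is ramped up during training, and occasional pruning of the smallest weights while training continues. The network size, learning rate, schedules, patience, and toy data are illustrative assumptions, not values or procedures taken from the paper's experiments.

```python
# Minimal sketch (not the authors' original code): stopped training combined
# with a dynamically weighted penalty term and occasional weight pruning.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data, split into training and validation sets (illustrative).
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X) + 0.1 * rng.standard_normal(X.shape)
X_tr, y_tr, X_va, y_va = X[:150], y[:150], X[150:], y[150:]

# Oversized one-hidden-layer network (tanh hidden units, linear output).
n_hidden = 30
W1 = 0.5 * rng.standard_normal((1, n_hidden))
b1 = np.zeros(n_hidden)
W2 = 0.5 * rng.standard_normal((n_hidden, 1))
b2 = np.zeros(1)

def forward(X):
    """Return hidden activations and network output for input X."""
    h = np.tanh(X @ W1 + b1)
    return h, h @ W2 + b2

lr = 0.05              # learning rate (assumed)
lam = 0.0              # weight-decay strength, adjusted during training
best_va, best_params = np.inf, None
patience, wait = 50, 0

for epoch in range(2000):
    # Gradient step on squared error plus the current weight-decay penalty.
    h, out = forward(X_tr)
    err = out - y_tr
    dW2 = h.T @ err / len(X_tr) + lam * W2
    db2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1.0 - h ** 2)
    dW1 = X_tr.T @ dh / len(X_tr) + lam * W1
    db1 = dh.mean(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    # Ingredient 1: adjust the penalty weighting during training
    # (a simple linear ramp here; the paper studies more refined schedules).
    lam = min(1e-3, lam + 1e-6)

    # Ingredient 2: dynamic weight pruning, zeroing the smallest input-layer
    # weights every so often while training continues (illustrative rule).
    if epoch and epoch % 200 == 0:
        thresh = np.quantile(np.abs(W1), 0.05)
        W1[np.abs(W1) < thresh] = 0.0

    # Ingredient 3: stopped training, i.e. monitor validation error and keep
    # the best parameters seen so far instead of training to convergence.
    _, va_out = forward(X_va)
    va_err = float(np.mean((va_out - y_va) ** 2))
    if va_err < best_va:
        best_va, wait = va_err, 0
        best_params = (W1.copy(), b1.copy(), W2.copy(), b2.copy())
    else:
        wait += 1
        if wait >= patience:       # stop before the training error converges
            break

W1, b1, W2, b2 = best_params
print(f"stopped at epoch {epoch}, best validation MSE = {best_va:.4f}")
```

The essential point of the sketch is that model selection happens inside the training loop: the validation error decides when to stop and which parameters to keep, while the penalty weighting and the pruned topology change as training proceeds rather than being fixed in advance.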