The back-propagation (BP) algorithm is widely used to find optimal weights of multilayer neural networks in many pattern recognition applications. Its critical drawbacks, however, are slow learning and convergence to local minima. A major cause of both is "premature saturation," a phenomenon in which the network's error remains nearly constant at a high value for some period during learning; it is known to be caused by an inappropriate set of initial weights. In this paper, the probability of premature saturation at the first epoch of BP learning is derived in terms of the maximum magnitude of the initial weights, the number of nodes in each layer, and the maximum slope of the sigmoidal activation function, and the result is verified by Monte Carlo simulation. Using this result, premature saturation can be avoided by choosing the initial weights appropriately.
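The paper derives the saturation probability analytically; the following is only an illustrative Monte Carlo sketch (not the paper's derivation) of the underlying effect: the larger the bound on the uniformly drawn initial weights, the more often a sigmoid unit's output starts out in a saturated region at the first epoch. The function name, the uniform input distribution on [0, 1], and the 0.9 saturation threshold are assumptions made for this example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def saturation_fraction(n_inputs, w_max, n_trials=10000, threshold=0.9):
    """Monte Carlo estimate of how often a single sigmoid unit starts saturated.

    Each trial draws n_inputs weights uniformly from [-w_max, w_max] and
    inputs uniformly from [0, 1], then checks whether the unit's output
    falls in a saturated region (above `threshold` or below 1 - `threshold`).
    Both the input distribution and the threshold are illustrative choices.
    """
    random.seed(0)  # fixed seed for a reproducible estimate
    saturated = 0
    for _ in range(n_trials):
        net = sum(random.uniform(-w_max, w_max) * random.uniform(0.0, 1.0)
                  for _ in range(n_inputs))
        y = sigmoid(net)
        if y > threshold or y < 1.0 - threshold:
            saturated += 1
    return saturated / n_trials

# Larger initial-weight bounds push far more units into saturation at epoch 0.
small = saturation_fraction(n_inputs=50, w_max=0.1)
large = saturation_fraction(n_inputs=50, w_max=5.0)
```

With a small bound (w_max = 0.1) the pre-activations concentrate near zero, where the sigmoid is steepest, so almost no unit starts saturated; with a large bound (w_max = 5.0) a substantial fraction do, which is the regime the paper's initial-weight condition is designed to avoid.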