New results on recurrent network training: unifying the algorithms and accelerating convergence

Authors:
A. F. Atiya; A. G. Parlos
Affiliations:
Dept. of Electr. Eng., California Inst. of Technol., Pasadena, CA;-
Venue:
IEEE Transactions on Neural Networks
Year:
2000

Abstract

How to train recurrent networks efficiently remains a challenging and active research topic. Most of the proposed training approaches are based on computational ways to efficiently obtain the gradient of the error function, and can generally be grouped into five major categories. In this study we present a derivation that unifies these approaches. We demonstrate that the approaches are simply five different ways of solving a particular matrix equation. The second goal of this paper is to develop a new algorithm based on the insights gained from the novel formulation. The new algorithm, which is based on approximating the error gradient, has lower computational complexity in computing the weight update than the competing techniques for most typical problems. In addition, it reaches the error minimum in a much smaller number of iterations. A desirable characteristic of recurrent network training algorithms is the ability to update the weights in an online fashion. We have also developed an online version of the proposed algorithm that is based on updating the error gradient approximation in a recursive manner.
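
As a loose illustration of the claim that different training approaches obtain the same error gradient by different computational routes, the sketch below compares a forward sensitivity sweep (in the spirit of real-time recurrent learning) with a backward adjoint sweep (in the spirit of backpropagation through time). It is not the authors' derivation, their matrix equation, or their new algorithm. The toy network x[t+1] = tanh(W x[t]), the terminal error E = 0.5 * ||x[T] - d||^2, and all function names are assumptions made only for this example.

```python
import numpy as np

def forward_states(W, x0, T):
    """Run the assumed toy dynamics x[t+1] = tanh(W @ x[t]) for T steps."""
    xs = [x0]
    for _ in range(T):
        xs.append(np.tanh(W @ xs[-1]))
    return xs

def grad_forward(W, x0, d, T):
    """Forward sweep: propagate sensitivities dx[t]/dW alongside the state."""
    n = W.shape[0]
    xs = forward_states(W, x0, T)
    idx = np.arange(n)
    P = np.zeros((n, n, n))  # P[i, j, k] = d x[t][i] / d W[j, k]
    for t in range(T):
        slope = 1.0 - xs[t + 1] ** 2          # tanh'(a) = 1 - tanh(a)^2
        dP = np.einsum('im,mjk->ijk', W, P)   # indirect effect through x[t]
        dP[idx, idx, :] += xs[t]              # direct effect of W[i, k] on unit i
        P = slope[:, None, None] * dP
    # Chain rule with dE/dx[T] = x[T] - d
    return np.einsum('i,ijk->jk', xs[T] - d, P)

def grad_backward(W, x0, d, T):
    """Backward sweep: propagate the adjoint dE/dx[t] from t = T down to 0."""
    xs = forward_states(W, x0, T)
    grad = np.zeros_like(W)
    lam = xs[T] - d                           # dE/dx[T]
    for t in range(T - 1, -1, -1):
        delta = lam * (1.0 - xs[t + 1] ** 2)  # dE/d(pre-activation at t+1)
        grad += np.outer(delta, xs[t])
        lam = W.T @ delta                     # dE/dx[t]
    return grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, T = 4, 6
    W = 0.5 * rng.standard_normal((n, n))
    x0 = rng.standard_normal(n)
    d = rng.standard_normal(n)
    gap = np.max(np.abs(grad_forward(W, x0, d, T) - grad_backward(W, x0, d, T)))
    print("max difference between the two gradients:", gap)
```

Both sweeps return the same gradient up to floating-point rounding, but their costs differ: the forward sweep stores an n x n x n sensitivity array and can run online, while the backward sweep stores the state trajectory and needs the whole sequence before it can start.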