Robust Adaptive Gradient-Descent Training Algorithm for Recurrent Neural Networks in Discrete Time Domain

Authors:
Qing Song; Yilei Wu; Yeng Chai Soh
Affiliations:
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
Venue:
IEEE Transactions on Neural Networks
Year:
2008

Abstract

For a recurrent neural network (RNN), the transient response is a critical issue, especially in real-time signal processing applications. Conventional RNN training algorithms, such as backpropagation through time (BPTT) and real-time recurrent learning (RTRL), have not adequately addressed this problem because they converge slowly. Increasing the learning rate may improve the performance of the RNN, but it can destabilize training by causing the weights to diverge. An optimal tradeoff between RNN training speed and weight convergence is therefore desired. In this paper, a robust adaptive gradient-descent (RAGD) training algorithm for RNNs is developed based on a novel hybrid training concept. It switches the training pattern between standard real-time online backpropagation (BP) and RTRL according to the derived convergence and stability conditions. The weight convergence and L2-stability of the algorithm are derived via the conic sector theorem. The optimized adaptive learning maximizes the training speed of the RNN at each weight update without violating the stability and convergence criteria. Computer simulations are carried out to demonstrate the applicability of the theoretical results.
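
The hybrid idea in the abstract, switching per time step between a one-step online BP gradient and the full RTRL gradient while adapting the step size, can be illustrated with a minimal NumPy sketch. This is not the RAGD algorithm itself: the abstract does not state the derived conditions, so the switching test (a hypothetical gradient-norm bound `g_max`) and the normalized step size (`mu` divided by one plus the squared gradient norm) are stand-in assumptions, as are the function name `train_hybrid`, the tiny tanh network, and the sine prediction task.

```python
import numpy as np

def train_hybrid(x, d, n_hidden=4, mu=0.5, g_max=5.0, seed=0):
    """Online training of a tiny RNN, switching between one-step BP and RTRL.

    ASSUMPTION: the switching test (g_max) and the normalized step size are
    illustrative stand-ins, NOT the conic-sector conditions of the paper.
    """
    rng = np.random.default_rng(seed)
    n = n_hidden
    W = 0.1 * rng.standard_normal((n, n))   # recurrent weights (trained)
    U = 0.5 * rng.standard_normal(n)        # input weights (kept fixed here)
    v = 0.1 * rng.standard_normal(n)        # output weights (trained)
    h = np.zeros(n)
    P = np.zeros((n, n, n))                 # P[k, i, j] = dh[k] / dW[i, j]
    errors, modes = [], []

    for x_t, d_t in zip(x, d):
        h_prev = h
        h = np.tanh(W @ h_prev + U * x_t)
        e = d_t - v @ h                     # a priori output error

        # Direct dependence of the pre-activation on W: delta_{ki} * h_prev[j]
        E = np.zeros((n, n, n))
        E[np.arange(n), np.arange(n), :] = h_prev
        D = (1.0 - h**2)[:, None, None]     # tanh derivative, broadcast over (i, j)

        P_bp = D * E                                        # ignores dh_prev/dW
        P_rtrl = D * (np.tensordot(W, P, axes=(1, 0)) + E)  # full RTRL recursion

        g_rtrl = np.tensordot(v, P_rtrl, axes=(0, 0))       # dy/dW via RTRL
        use_rtrl = np.linalg.norm(g_rtrl) <= g_max          # hypothetical rule
        g_W = g_rtrl if use_rtrl else np.tensordot(v, P_bp, axes=(0, 0))
        P = P_rtrl if use_rtrl else P_bp    # falling back to BP resets the history

        # Normalized step: shrinks automatically when the gradient is large.
        eta = mu / (1.0 + np.sum(g_W**2) + h @ h)
        W = W + eta * e * g_W               # descent on 0.5 * e**2
        v = v + eta * e * h                 # dy/dv = h

        errors.append(e)
        modes.append("rtrl" if use_rtrl else "bp")
    return W, v, np.array(errors), modes

# Usage: one-step-ahead prediction of a sine wave (toy task, assumed here).
t = np.arange(200)
W, v, errors, modes = train_hybrid(np.sin(0.2 * t), np.sin(0.2 * (t + 1)))
print("RTRL steps:", modes.count("rtrl"), "BP steps:", modes.count("bp"))
```

The normalized step is a common way to keep a single update bounded regardless of gradient size, which is the kind of speed-versus-stability tradeoff the abstract describes. The paper instead derives its learning rate and switching conditions from the conic sector theorem.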