A large class of nonlinear dynamic adaptive systems, such as dynamic recurrent neural networks, can be effectively represented by signal flow graphs (SFGs). In this representation, a complex system is described as a general connection of many simple components, each implementing a simple one-input, one-output transformation, as in an electrical circuit. Although graph representations are popular in the neural network community, they are often used for qualitative description rather than for rigorous representation and computation. In this article, a method for both on-line and batch-backward computation of the gradient of a system output or cost function with respect to the system parameters is derived from SFG representation theory and its known properties. The system can be any causal dynamic system, in general nonlinear and time-variant, that is represented by an SFG; in particular, it can be any feedforward, time-delay, or recurrent neural network. In this work, we use discrete-time notation, but the same theory holds for the continuous-time case. The gradient is obtained in a straightforward way by analyzing two SFGs, the original one and its adjoint (obtained from the first by simple transformations), without the complex chain rule expansions of derivatives usually employed. This method can be used for sensitivity analysis and for learning, both off-line and on-line. On-line learning is particularly important since it is required by many real applications, such as digital signal processing, system identification and control, channel equalization, and predistortion.
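To make the idea of an original graph and its adjoint concrete, the following is a minimal sketch of the batch-backward case only. It is not the article's algorithm. It assumes a hypothetical one-neuron recurrent system y[t] = tanh(w*y[t-1] + v*x[t]) with a squared-error cost, and all names (`forward`, `adjoint_gradient`, `w`, `v`, `x`, `d`) are illustrative. The backward loop plays the role of the adjoint graph: signals flow in reversed time, the delay branch is traversed in the opposite direction, and the nonlinearity is replaced by its local gain.

```python
import math

def forward(w, v, x):
    """Original graph: y[t] = tanh(w*y[t-1] + v*x[t]), with y[-1] = 0."""
    y, prev = [], 0.0
    for xt in x:
        prev = math.tanh(w * prev + v * xt)
        y.append(prev)
    return y

def cost(w, v, x, d):
    """Assumed cost: J = 0.5 * sum_t (y[t] - d[t])**2."""
    y = forward(w, v, x)
    return 0.5 * sum((yt - dt) ** 2 for yt, dt in zip(y, d))

def adjoint_gradient(w, v, x, d):
    """Gradient of J w.r.t. (w, v) from one forward and one reversed pass."""
    y = forward(w, v, x)
    grad_w = grad_v = 0.0
    lam = 0.0  # adjoint signal returning through the reversed delay branch
    for t in reversed(range(len(x))):
        # Adjoint at the output node: injected error plus the delayed feedback.
        delta_y = (y[t] - d[t]) + lam
        # Reversed nonlinearity: multiply by the local gain tanh'(s) = 1 - y**2.
        delta_s = delta_y * (1.0 - y[t] ** 2)
        y_prev = y[t - 1] if t > 0 else 0.0
        # Each parameter gradient accumulates (branch input) * (adjoint signal).
        grad_w += delta_s * y_prev
        grad_v += delta_s * x[t]
        # Signal sent back through the weight w toward time step t - 1.
        lam = w * delta_s
    return grad_w, grad_v

# Illustrative data, chosen arbitrarily for the sketch.
x = [0.5, -1.0, 0.3, 0.8]
d = [0.1, -0.2, 0.0, 0.4]
gw, gv = adjoint_gradient(0.7, -0.4, x, d)
```

Each parameter gradient is the accumulated product of the signal entering the parameter's branch in the original graph and the signal at the corresponding point of the reversed pass, so no chain rule expansion is written out by hand. The result can be checked against central finite differences of `cost`. An on-line variant would need to limit or restructure the reversed pass so that it does not wait for the end of the sequence; that is not shown here.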