Diagrammatic derivation of gradient algorithms for neural networks

  • Authors:
  • Eric A. Wan; Françoise Beaufays

  • Affiliations:
  • Department of Electrical Engineering and Applied Physics, Oregon Graduate Institute of Science & Technology, P.O. Box 91000, Portland, OR 97291 USA; Department of Electrical Engineering, Stanford University, Stanford, CA 94305-4055 USA

  • Venue:
  • Neural Computation
  • Year:
  • 1996

Abstract

Deriving gradient algorithms for time-dependent neural network structures typically requires numerous chain rule expansions, diligent bookkeeping, and careful manipulation of terms. In this paper, we show how to derive such algorithms via a set of simple block diagram manipulation rules. The approach provides a common framework to derive popular algorithms including backpropagation and backpropagation-through-time without a single chain rule expansion. Additional examples are provided for a variety of complicated architectures to illustrate both the generality and the simplicity of the approach.
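To make the idea concrete, here is a minimal sketch of the diagram-transposition principle on a toy one-unit network y = tanh(w·x + b): the gradient is obtained by running the error backward through a "transposed" diagram in which each block is replaced by multiplication with its local derivative, rather than by writing out chain-rule expansions. The function names and the toy network are illustrative, not taken from the paper.

```python
import math

def forward(w, b, x):
    s = w * x + b          # summing junction in the forward diagram
    return math.tanh(s)    # nonlinear block

def adjoint_grad(w, b, x, target):
    """Gradient of E = 0.5*(y - target)^2 via the transposed diagram."""
    # Forward pass through the original block diagram.
    s = w * x + b
    y = math.tanh(s)
    e = y - target
    # Backward pass through the transposed diagram: the tanh block
    # becomes multiplication by its local derivative (1 - tanh(s)^2);
    # branch points and summing junctions swap roles (trivial here).
    delta = e * (1.0 - math.tanh(s) ** 2)
    return delta * x, delta  # dE/dw, dE/db

# Sanity check against a central finite difference on dE/dw.
w, b, x, t = 0.5, -0.2, 1.3, 0.7
gw, gb = adjoint_grad(w, b, x, t)
eps = 1e-6
E = lambda w_: 0.5 * (forward(w_, b, x) - t) ** 2
numeric = (E(w + eps) - E(w - eps)) / (2 * eps)
print(abs(gw - numeric) < 1e-6)
```

The same mechanical transposition scales to layered and recurrent diagrams, which is what lets the paper recover backpropagation and backpropagation-through-time without per-term chain-rule bookkeeping.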