Learning long-term dependencies with recurrent neural networks

  • Authors:
  • Anton Maximilian Schaefer, Steffen Udluft, Hans-Georg Zimmermann

  • Affiliations:
  • Information and Communications, Learning Systems, Siemens AG, Corporate Technology, 80200 Munich, Germany; Neuroinformatics Group, University of Osnabrueck, 49069 Osnabrueck, Germany

  • Venue:
  • Neurocomputing
  • Year:
  • 2008

Abstract

Recurrent neural networks (RNN) unfolded in time are in theory able to map any open dynamical system. Still, they are often claimed to be unable to identify long-term dependencies in the data. In particular, it is frequently stated that RNN unfolded in time and trained with backpropagation fail to learn inter-temporal influences more than 10 time steps apart. This paper refutes this often-cited statement by giving counter-examples. We show that basic time-delay RNN unfolded in time and formulated as state space models are indeed capable of learning time lags of at least 100 time steps. We point out that they even possess a self-regularisation characteristic, which adapts the internal error backflow, and we analyse their optimal weight initialisation. In addition, we introduce the idea of inflation for modelling long- and short-term memory and demonstrate that this technique further improves the performance of RNN.
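
The sketch below illustrates the kind of architecture the abstract refers to: a time-delay RNN written as a state space model, s_t = tanh(A s_{t-1} + B u_t), y_t = C s_t, and unfolded over the time steps of a sequence. This is a minimal illustration assuming that standard formulation; the dimensions, initialisation scale, and function names are placeholders and are not taken from the paper itself.

```python
import numpy as np

# Minimal sketch of a time-delay RNN formulated as a state space model and
# unfolded in time:
#   s_t = tanh(A s_{t-1} + B u_t)   (state transition)
#   y_t = C s_t                     (output)
# Dimensions and the plain scaled-normal initialisation below are illustrative
# assumptions; the paper analyses the optimal weight initialisation in detail.

rng = np.random.default_rng(0)

state_dim, input_dim, output_dim = 20, 3, 1

A = rng.normal(scale=0.1, size=(state_dim, state_dim))  # state-to-state weights
B = rng.normal(scale=0.1, size=(state_dim, input_dim))  # input-to-state weights
C = rng.normal(scale=0.1, size=(output_dim, state_dim)) # state-to-output weights

def unfold(inputs, s0=None):
    """Run the unfolded recurrence over a sequence of inputs u_1..u_T."""
    s = np.zeros(state_dim) if s0 is None else s0
    outputs = []
    for u in inputs:                  # one unfolding step per time step
        s = np.tanh(A @ s + B @ u)    # shared weights across all unfolded steps
        outputs.append(C @ s)
    return np.array(outputs)

# Example: a sequence spanning 100 time steps, the order of time lag the
# paper reports such networks can still learn.
u_seq = rng.normal(size=(100, input_dim))
y_seq = unfold(u_seq)
print(y_seq.shape)  # (100, 1)
```

Training such a network with backpropagation through the unfolded structure is what the abstract's long-term dependency claims refer to; the snippet only shows the forward pass to make the shared-weight unfolding explicit.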