Long Short-Term Memory (LSTM) is a second-order recurrent neural network architecture that excels at storing sequential short-term memories and retrieving them many time-steps later. LSTM's original training algorithm provides the important properties of spatial and temporal locality, which are missing from other training approaches, at the cost of limiting its applicability to a small set of network architectures. Here we introduce the Generalized Long Short-Term Memory (LSTM-g) training algorithm, which provides LSTM-like locality while being applicable without modification to a much wider range of second-order network architectures. With LSTM-g, all units have an identical set of operating instructions for both activation and learning, subject only to the configuration of their local environment in the network; this contrasts with the original LSTM training algorithm, in which each type of unit has its own activation and training instructions. When applied to LSTM architectures with peephole connections, LSTM-g exploits an additional source of back-propagated error that can enable better performance than the original algorithm. The broad architectural applicability of LSTM-g allows us to demonstrate that recurrent networks engineered for specific tasks can produce better results than single-layer networks. We conclude that LSTM-g has the potential both to improve the performance and to broaden the applicability of spatially and temporally local gradient-based training algorithms for recurrent neural networks.
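The abstract refers to peephole connections without spelling them out. For context, the sketch below shows the forward pass of a standard peephole LSTM cell (in the style of Gers and Schmidhuber), in which each gate receives a direct, elementwise-weighted view of the cell state; it is these peephole paths that LSTM-g additionally uses as a channel for back-propagated error. This is a generic illustration only, not the LSTM-g algorithm itself: the parameter names (W, U, p, b), the initialization, and the example dimensions are all assumptions made for the sketch, and the learning rules that distinguish LSTM-g are not shown.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def peephole_lstm_step(x, h_prev, c_prev, params):
        """One forward step of a standard peephole LSTM cell.

        Peephole weights (p['i'], p['f'], p['o']) let each gate observe
        the cell state directly. All parameter names here are chosen for
        illustration; this is not the LSTM-g update rule.
        """
        W, U, p, b = params["W"], params["U"], params["p"], params["b"]

        # Input and forget gates peek at the *previous* cell state.
        i = sigmoid(W["i"] @ x + U["i"] @ h_prev + p["i"] * c_prev + b["i"])
        f = sigmoid(W["f"] @ x + U["f"] @ h_prev + p["f"] * c_prev + b["f"])

        # Candidate memory and cell-state update.
        g = np.tanh(W["g"] @ x + U["g"] @ h_prev + b["g"])
        c = f * c_prev + i * g

        # Output gate peeks at the *updated* cell state.
        o = sigmoid(W["o"] @ x + U["o"] @ h_prev + p["o"] * c + b["o"])
        h = o * np.tanh(c)
        return h, c

    # Tiny usage example with randomly initialized parameters.
    rng = np.random.default_rng(0)
    n_in, n_cell = 4, 3
    params = {
        "W": {k: rng.normal(scale=0.1, size=(n_cell, n_in)) for k in "ifgo"},
        "U": {k: rng.normal(scale=0.1, size=(n_cell, n_cell)) for k in "ifgo"},
        "p": {k: rng.normal(scale=0.1, size=n_cell) for k in "ifo"},
        "b": {k: np.zeros(n_cell) for k in "ifgo"},
    }
    h, c = np.zeros(n_cell), np.zeros(n_cell)
    for t in range(5):
        h, c = peephole_lstm_step(rng.normal(size=n_in), h, c, params)
    print(h)

Because the gates here are products of a weight, a gating activation, and a source activation, the connections are second-order in the sense the abstract uses; the paper's contribution is a training rule that treats every such unit uniformly rather than with type-specific instructions.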