Error entropy minimization for LSTM training

  • Authors:
  • Luís A. Alexandre; J. P. Marques de Sá

  • Affiliations:
  • Department of Informatics and IT-Networks and Multimedia Group, University of Beira Interior, Covilhã, Portugal; Faculty of Engineering and INEB, University of Porto, Portugal

  • Venue:
  • ICANN'06: Proceedings of the 16th International Conference on Artificial Neural Networks, Part I
  • Year:
  • 2006

Abstract

In this paper we present a new training algorithm for the Long Short-Term Memory (LSTM) recurrent neural network. This algorithm uses entropy instead of the usual mean squared error as the cost function for the weight update. More precisely, we use the Error Entropy Minimization (EEM) approach, where the entropy of the error is minimized after each symbol is presented to the network. Our experiments show that this approach enables the LSTM to converge more frequently than with the traditional learning algorithm. This in turn eases the burden of parameter tuning, since learning is achieved for a wider range of parameter values. The use of EEM also reduces, in some cases, the number of epochs needed for convergence.
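
The abstract does not spell out the entropy cost itself. In the EEM literature this work builds on, the error entropy is typically Renyi's quadratic entropy, estimated with a Gaussian Parzen window over the error samples; minimizing it is equivalent to maximizing the so-called information potential of the errors. The sketch below is a minimal NumPy illustration of that estimator and its gradient with respect to the error samples (which would be chained into the LSTM weight updates), written under those assumptions. It is not the paper's implementation; the function names and the kernel width `sigma` are illustrative.

```python
import numpy as np

def gaussian_kernel(x, sigma):
    """Gaussian (Parzen) kernel used to estimate the error density."""
    return np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def information_potential(errors, sigma=0.5):
    """Parzen estimate of the information potential V(e) = E[p(e)].

    Renyi's quadratic entropy is H2(e) = -log V(e), so minimizing the
    error entropy is the same as maximizing V(e).
    """
    diffs = errors[:, None] - errors[None, :]        # pairwise error differences
    # Kernel width sqrt(2)*sigma arises from convolving two Gaussian kernels.
    return gaussian_kernel(diffs, np.sqrt(2.0) * sigma).mean()

def renyi_quadratic_entropy(errors, sigma=0.5):
    """H2 estimate of the errors: the quantity EEM drives down."""
    return -np.log(information_potential(errors, sigma))

def eem_error_gradient(errors, sigma=0.5):
    """Gradient of H2 w.r.t. each error sample.

    dH2/de_i = -(1/V) * dV/de_i; this vector would be backpropagated
    through the LSTM in place of the usual MSE error signal.
    """
    diffs = errors[:, None] - errors[None, :]
    s2 = 2.0 * sigma**2                              # variance of the sqrt(2)*sigma kernel
    k = gaussian_kernel(diffs, np.sqrt(2.0) * sigma)
    v = k.mean()
    # dV/de_i: the odd kernel derivative makes the two pairwise sums equal.
    dv = (-(diffs / s2) * k).sum(axis=1) * 2.0 / errors.size**2
    return -dv / v

# Toy usage: the entropy estimate drops as the errors concentrate near zero.
errs_wide = np.random.randn(64)
errs_tight = 0.1 * np.random.randn(64)
print(renyi_quadratic_entropy(errs_wide), renyi_quadratic_entropy(errs_tight))
print(eem_error_gradient(errs_tight).shape)          # one gradient entry per error sample
```

One property worth noting with this estimator: the kernel width `sigma` acts as a smoothing parameter of the error density estimate, so it is itself a tuning knob, even though the abstract reports that EEM makes learning succeed over a wider range of parameter values overall.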