Learning long-term dependencies in NARX recurrent neural networks

Authors:
Tsungnan Lin; B. G. Horne; P. Tino; C. L. Giles
Affiliations:
NEC Res. Inst., Princeton, NJ
Venue:
IEEE Transactions on Neural Networks
Year:
1996

Abstract

It has previously been shown that gradient-descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies, i.e., problems for which the desired output depends on inputs presented at times far in the past. We show that the long-term dependencies problem is lessened for a class of architectures called nonlinear autoregressive models with exogenous inputs (NARX) recurrent neural networks, which have powerful representational capabilities. We have previously reported that, on problems including grammatical inference and nonlinear system identification, gradient-descent learning can be more effective in NARX networks than in recurrent neural network architectures that have “hidden states.” Typically, a NARX network converges much faster and generalizes better than other networks. The results in this paper are consistent with this phenomenon. We present experimental results which show that NARX networks can often retain information for two to three times as long as conventional recurrent neural networks. We show that although NARX networks do not circumvent the problem of long-term dependencies, they can greatly improve performance on long-term dependency problems. We also describe in detail some of the assumptions regarding what it means to latch information robustly and suggest possible ways to loosen these assumptions.
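
As an illustration of the architecture the abstract refers to, the following is a minimal sketch of a NARX-style forward pass, in which the output is computed from a window of recent inputs and a window of the network's own delayed outputs. It is not the authors' implementation: the function names `make_narx` and `run_narx`, the single tanh hidden layer, the random weight initialisation, and the tap counts `n_u` and `n_y` are all assumptions made for the example, and no training is shown.

```python
import math
import random
from collections import deque


def make_narx(n_u, n_y, n_hidden, seed=0):
    """Build a small NARX-style network with random weights (illustrative only).

    n_u: number of input taps (current input plus n_u - 1 past inputs).
    n_y: number of output taps (delayed outputs fed back as inputs).
    """
    rng = random.Random(seed)
    n_in = n_u + n_y
    # Each row holds the weights for one hidden unit, plus one bias weight.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    # Output weights, plus one bias weight.
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return {"n_u": n_u, "n_y": n_y, "w_hidden": w_hidden, "w_out": w_out}


def run_narx(net, inputs):
    """Run the network over a scalar input sequence and return the outputs."""
    # Tapped delay lines, initialised to zero; the newest value sits at index 0.
    u_taps = deque([0.0] * net["n_u"], maxlen=net["n_u"])
    y_taps = deque([0.0] * net["n_y"], maxlen=net["n_y"])
    outputs = []
    for u in inputs:
        u_taps.appendleft(u)  # the oldest input falls off the far end
        x = list(u_taps) + list(y_taps) + [1.0]  # trailing 1.0 is the bias input
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x)))
                  for row in net["w_hidden"]]
        y = math.tanh(sum(w * h for w, h in zip(net["w_out"], hidden + [1.0])))
        y_taps.appendleft(y)  # feed the new output back through the delay line
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    net = make_narx(n_u=2, n_y=2, n_hidden=3, seed=1)
    print(run_narx(net, [1.0, 0.0, 0.0, 0.0, 0.0]))
```

In this sketch, setting `n_y = 0` leaves only a finite input window, so the effect of an input pulse disappears from the output once it leaves the `n_u` input taps. With `n_y > 0`, the delayed outputs give earlier information a route to influence later outputs.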