2007 Special Issue: Learning grammatical structure with Echo State Networks

  • Authors:
  • Matthew H. Tong; Adam D. Bickett; Eric M. Christiansen; Garrison W. Cottrell

  • Affiliations:
  • Matthew H. Tong, Adam D. Bickett, Garrison W. Cottrell: Department of Computer Science and Engineering, University of California at San Diego, 9500 Gilman Drive, Dept 0404, San Diego, CA 92093-0404, USA
  • Eric M. Christiansen: Swarthmore College, 500 College Avenue, Swarthmore, PA 19081, USA

  • Venue:
  • Neural Networks
  • Year:
  • 2007

Abstract

Echo State Networks (ESNs) have been shown to be effective for a number of tasks, including motor control, dynamic time series prediction, and memorizing musical sequences. However, their performance on natural language tasks has been largely unexplored until now. Simple Recurrent Networks (SRNs) have a long history in language modeling and show a striking similarity in architecture to ESNs. A comparison of SRNs and ESNs on a natural language task is therefore a natural choice for experimentation. Elman applied SRNs to a standard task in statistical NLP: predicting the next word in a corpus, given the previous words. Using a simple context-free grammar and an SRN with backpropagation through time (BPTT), Elman showed that the network learned internal representations sensitive to linguistic processes useful for the prediction task. Here, using ESNs, we show that training such internal representations is unnecessary to achieve levels of performance comparable to SRNs. We also compare the processing capabilities of ESNs to bigrams and trigrams. Due to some unexpected regularities of Elman's grammar, these statistical techniques are capable of maintaining dependencies over greater distances than might initially be expected. However, we show that the memory of ESNs in this word-prediction task, although noisy, extends significantly beyond that of bigrams and trigrams, enabling ESNs to make good predictions of verb agreement at distances over which these methods operate at chance. Overall, our results indicate a surprising ability of ESNs to learn a grammar, suggesting that they form useful internal representations without learning them.
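
To make the contrast concrete, below is a minimal sketch of the kind of model the abstract describes: an ESN whose input and recurrent (reservoir) weights are fixed and random, with only a linear readout trained to predict the next word. The toy corpus, reservoir size, spectral radius, and ridge penalty are illustrative assumptions, not the paper's settings.

```python
# Minimal Echo State Network sketch for next-word prediction (illustrative only).
# The toy corpus and all hyperparameters below are assumptions for this sketch,
# not the configuration reported in the paper.
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus and one-hot encoding (hypothetical stand-in for Elman's grammar).
corpus = "boys see girls . girls who boys see walk .".split()
vocab = sorted(set(corpus))
V = len(vocab)
idx = {w: i for i, w in enumerate(vocab)}
onehot = np.eye(V)

# Reservoir: fixed random input and recurrent weights; only W_out is trained.
N = 100                                     # reservoir units (assumed)
W_in = rng.uniform(-0.5, 0.5, (N, V))
W = rng.uniform(-0.5, 0.5, (N, N))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))   # scale spectral radius to 0.9 (assumed)

# Drive the reservoir with the word sequence, collecting states and targets.
states, targets = [], []
x = np.zeros(N)
for t in range(len(corpus) - 1):
    u = onehot[idx[corpus[t]]]
    x = np.tanh(W_in @ u + W @ x)           # standard ESN state update
    states.append(x.copy())
    targets.append(onehot[idx[corpus[t + 1]]])  # next word as prediction target

X = np.array(states)                        # (T, N) reservoir states
Y = np.array(targets)                       # (T, V) next-word one-hots

# Train only the linear readout, here with closed-form ridge regression.
ridge = 1e-3
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(N), X.T @ Y).T   # (V, N)

# Predicted next-word scores at each step; argmax gives the network's guess.
pred = X @ W_out.T
print([vocab[i] for i in pred.argmax(axis=1)])
```

An SRN for the same task would differ mainly in that the input and recurrent weights are also trained (e.g., with BPTT); that is precisely the training of internal representations the abstract argues is unnecessary for comparable prediction performance.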