Distributed representations, simple recurrent networks, and grammatical structure
Machine Learning - Connectionist approaches to language learning
Strong systematicity in sentence processing by an echo state network
ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
CiE '07 Proceedings of the 3rd conference on Computability in Europe: Computation and Logic in the Real World
Benchmarking reservoir computing on time-independent classification tasks
IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Training recurrent connectionist models on symbolic time series
ICONIP'08 Proceedings of the 15th international conference on Advances in neuro-information processing - Volume Part I
Improving the state space organization of untrained recurrent networks
ICONIP'08 Proceedings of the 15th international conference on Advances in neuro-information processing - Volume Part I
What makes a brain smart? Reservoir computing as an approach for general intelligence
AGI'11 Proceedings of the 4th international conference on Artificial general intelligence
Simple deterministically constructed cycle reservoirs with regular jumps
Neural Computation
On-line processing of grammatical structure using reservoir computing
ICANN'12 Proceedings of the 22nd international conference on Artificial Neural Networks and Machine Learning - Volume Part I
Short term memory in input-driven linear dynamical systems
Neurocomputing
Echo State Networks (ESNs) have been shown to be effective on a number of tasks, including motor control, dynamic time series prediction, and memorizing musical sequences. However, their performance on natural language tasks has so far been largely unexplored. Simple Recurrent Networks (SRNs) have a long history in language modeling and are strikingly similar in architecture to ESNs, so a comparison of the two on a natural language task is a natural choice of experiment. Elman applied SRNs to a standard task in statistical NLP: predicting the next word in a corpus, given the previous words. Using a simple context-free grammar and an SRN trained with backpropagation through time (BPTT), Elman showed that the network learned internal representations sensitive to linguistic processes that were useful for the prediction task. Here, using ESNs, we show that training such internal representations is unnecessary to achieve levels of performance comparable to SRNs. We also compare the processing capabilities of ESNs to bigram and trigram models. Due to some unexpected regularities in Elman's grammar, these statistical techniques can maintain dependencies over greater distances than one might initially expect. However, we show that the memory of ESNs in this word-prediction task, although noisy, extends significantly beyond that of bigrams and trigrams, enabling ESNs to make good predictions of verb agreement at distances over which the n-gram methods operate at chance. Overall, our results indicate a surprising ability of ESNs to learn a grammar, suggesting that they form useful internal representations without learning them.
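To make the setup concrete, the following is a minimal sketch of the kind of architecture the abstract describes: a fixed, untrained recurrent reservoir driven by one-hot word inputs, with only a linear readout fitted (here by ridge regression) to predict the next word, plus a small trigram baseline for comparison. The toy corpus, vocabulary, reservoir size, and all parameter values below are illustrative assumptions, not the paper's actual materials.

import numpy as np
from collections import Counter, defaultdict

rng = np.random.default_rng(0)

# Toy corpus and one-hot word encoding (assumed; the paper uses a corpus
# generated from an Elman-style grammar).
corpus = "boys who chase dogs see girls . girl who sees boys chases dog .".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

def one_hot(w):
    v = np.zeros(V)
    v[idx[w]] = 1.0
    return v

# Fixed, untrained reservoir (sizes and scalings are assumptions).
N = 100                                    # reservoir size
W_in = rng.uniform(-0.1, 0.1, (N, V))      # input weights, never trained
W = rng.uniform(-0.5, 0.5, (N, N))         # recurrent weights, never trained
W *= 0.9 / max(abs(np.linalg.eigvals(W)))  # keep spectral radius below 1

def run_reservoir(words):
    """Drive the reservoir with a word sequence; return the state after each word."""
    x = np.zeros(N)
    states = []
    for w in words:
        x = np.tanh(W_in @ one_hot(w) + W @ x)
        states.append(x.copy())
    return np.array(states)

# Train only the linear readout with ridge regression.
X = run_reservoir(corpus[:-1])                    # state after each word
Y = np.array([one_hot(w) for w in corpus[1:]])    # next-word targets
ridge = 1e-2
W_out = Y.T @ X @ np.linalg.inv(X.T @ X + ridge * np.eye(N))

# Predict the next word after a context.
context = "girl who sees boys".split()
state = run_reservoir(context)[-1]
print("ESN prediction:    ", vocab[int(np.argmax(W_out @ state))])

# A simple trigram baseline for comparison: counts of w3 given (w1, w2).
trigram = defaultdict(Counter)
for w1, w2, w3 in zip(corpus, corpus[1:], corpus[2:]):
    trigram[(w1, w2)][w3] += 1
prev = tuple(context[-2:])
if trigram[prev]:
    print("trigram prediction:", trigram[prev].most_common(1)[0][0])

Scaling the recurrent weights so their spectral radius stays below one gives the reservoir its fading "echo state" memory; the readout is the only component ever fitted to data, which is why no internal representations need to be learned, whereas the trigram baseline can only condition on the two preceding words.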