Natural language learning by recurrent neural networks: a comparison with probabilistic approaches

  • Authors and affiliations:
  • Michael Towsey (Queensland University of Technology, QLD, Australia)
  • Joachim Diederich (Queensland University of Technology, QLD, Australia)
  • Ingo Schellhammer (Queensland University of Technology, QLD, Australia; University of Muenster, Muenster, Germany)
  • Stephan Chalup (Queensland University of Technology, QLD, Australia)
  • Claudia Brugman (University of Otago, Dunedin, New Zealand)

  • Venue:
  • NeMLaP3/CoNLL '98 Proceedings of the Joint Conferences on New Methods in Language Processing and Computational Natural Language Learning
  • Year:
  • 1998


Abstract

We present preliminary results of experiments with two types of recurrent neural networks on a natural language learning task. The networks, Elman networks and Recurrent Cascade Correlation (RCC), were trained on the text of a first-year primary school reader. The networks performed a one-step look-ahead task, i.e. they had to predict the lexical category of the next word. Elman networks with 9 hidden units gave the best training results (72% correct) but scored only 63% when tested for generalisation using a "leave-one-sentence-out" cross-validation technique. An RCC network could learn 99.6% of the training set by adding up to 42 hidden units, but achieved its best generalisation (63%) with only four hidden units. Results are presented comparing network learning with bi-, tri-, 4- and 5-gram performance. Greatest prediction uncertainty (measured as the entropy of the output units) occurred not at sentence boundaries but when the first verb was the input.
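
For illustration only, the following is a minimal sketch (not the authors' implementation) of the prediction setup described in the abstract: an Elman-style recurrent network that maps a sequence of lexical categories to a probability distribution over the next word's category, with the entropy of the output units serving as the uncertainty measure. The category set, network sizes, and untrained random weights are assumptions for demonstration; no training loop is shown.

    import numpy as np

    rng = np.random.default_rng(0)

    CATEGORIES = ["DET", "NOUN", "VERB", "ADJ", "PREP", "EOS"]  # assumed tag set
    n_cat, n_hidden = len(CATEGORIES), 9  # 9 hidden units, as in the Elman result

    # Elman network parameters: input->hidden, context(hidden)->hidden, hidden->output
    W_xh = rng.normal(scale=0.1, size=(n_cat, n_hidden))
    W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
    W_hy = rng.normal(scale=0.1, size=(n_hidden, n_cat))

    def one_hot(i, n):
        v = np.zeros(n)
        v[i] = 1.0
        return v

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def predict_next(category_indices):
        """Run the Elman recurrence over a sentence prefix (category indices)
        and return a probability distribution over the next word's category."""
        h = np.zeros(n_hidden)                # context units start at zero
        for idx in category_indices:
            x = one_hot(idx, n_cat)
            h = np.tanh(x @ W_xh + h @ W_hh)  # hidden state is copied back as context
        return softmax(h @ W_hy)

    def prediction_entropy(p):
        """Entropy of the output units (in bits), the paper's uncertainty measure."""
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    # Example: uncertainty after seeing "DET NOUN" (untrained weights, purely illustrative)
    prefix = [CATEGORIES.index(c) for c in ["DET", "NOUN"]]
    dist = predict_next(prefix)
    print(dict(zip(CATEGORIES, dist.round(3))), "entropy =", prediction_entropy(dist))

With trained weights, the same entropy computation could be evaluated word by word across a sentence to locate the points of greatest prediction uncertainty, which the paper reports occur at the first verb rather than at sentence boundaries.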