Learning language from its perceptual context

Authors:
Raymond J. Mooney
Affiliations:
Department of Computer Science, The University of Texas at Austin, Austin, TX
Venue:
PADL'11 Proceedings of the 13th international conference on Practical aspects of declarative languages
Year:
2011

Abstract

Current systems that learn to process natural language require laboriously constructed human-annotated training data. Ideally, a computer would be able to acquire language like a child by being exposed to linguistic input in the context of a relevant but ambiguous perceptual environment. As a step in this direction, we present a system that learns to sportscast simulated robot soccer games by example. The training data consists of textual human commentaries on RoboCup simulation games. A set of possible alternative meanings for each comment is automatically constructed from game event traces. Our previously developed systems for learning to parse and generate natural language (KRISP and WASP) were augmented to learn from this data and then commentate on novel games. Using this approach, the system has learned to sportscast in both English and Korean. The system has been evaluated on its ability to correctly match sentences to the events being described, parse sentences into correct meanings, and generate accurate linguistic descriptions of events. A human evaluation also compared the overall quality of the generated sportscasts with human-generated commentaries, demonstrating that the system's sportscasts are on par with those generated by humans.
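
To make the idea of ambiguous supervision concrete, the following minimal sketch pairs one comment with every event that occurred shortly before it, yielding a set of candidate meanings rather than a single label. It is an illustration only, not the system described in the abstract: the `Event` and `Comment` types, the `candidate_meanings` function, the five-second window, and the predicate-style meaning strings are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    time: float   # seconds into the game (hypothetical unit)
    meaning: str  # symbolic description of the event (hypothetical format)


@dataclass(frozen=True)
class Comment:
    time: float   # when the commentator produced the sentence
    text: str


def candidate_meanings(comment, events, window=5.0):
    """Return the meanings of all events within `window` seconds before the comment.

    The window size is an assumed parameter, chosen only for illustration.
    """
    return [e.meaning for e in events if 0.0 <= comment.time - e.time <= window]


# A toy event trace and one comment (invented data for illustration only).
events = [
    Event(10.0, "kick(purple7)"),
    Event(12.0, "pass(purple7, purple4)"),
    Event(30.0, "badPass(purple4, pink2)"),
]
comment = Comment(13.5, "Purple7 passes to Purple4")

candidates = candidate_meanings(comment, events)
print(candidates)
```

Here the comment is paired with two candidate meanings, only one of which it actually describes; a learner trained on such data has to work out which pairing is correct.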