A robot that uses existing vocabulary to infer non-visual word meanings from observation

  • Authors:
  • Kevin Gold; Brian Scassellati

  • Affiliations:
  • Department of Computer Science, Yale University, New Haven, CT (both authors)

  • Venue:
  • AAAI'07: Proceedings of the 22nd National Conference on Artificial Intelligence, Volume 1
  • Year:
  • 2007

Abstract

The authors present TWIG, a visually grounded word-learning system that uses its existing knowledge of vocabulary, grammar, and action schemas to help it learn the meanings of new words from its environment. Most systems built to learn word meanings from sensory data focus on the "base case" of learning words when the robot knows nothing, and do not incorporate grammatical knowledge to aid the process of inferring meaning. The present study shows how existing language knowledge can aid the word-learning process in three ways. First, partial parses of sentences can focus the robot's attention on the correct item or relation in the environment. Second, grammatical inference can suggest whether a new word refers to a unary or binary relation. Third, the robot's existing predicate schemas can suggest possibilities for a new predicate. The authors demonstrate that TWIG can use its understanding of the phrase "got the ball" while watching a game of catch to learn that "I" refers to the speaker, "you" refers to the addressee, and the names refer to particular people. The robot then uses these new words to learn that "am" and "are" refer to the identity relation.
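
To make the first two inference mechanisms in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the lexicon, the `Hypothesis` class, and `infer_from_partial_parse` are all hypothetical stand-ins for TWIG's grammar and predicate schemas, and the grammatical rules are deliberately toy-sized. The sketch shows how a partial parse with known words leaves a "hole" whose grammatical role constrains what an unknown word can mean.

```python
# Hypothetical sketch (not the TWIG source) of inferring word meanings
# from a partial parse. All names here are illustrative assumptions.

from dataclasses import dataclass

# Assumed toy lexicon: word -> (syntactic category, meaning schema).
LEXICON = {
    "got":  ("V", "got(x, y)"),   # known binary action predicate
    "the":  ("DET", None),
    "ball": ("N", "ball(x)"),
}

@dataclass
class Hypothesis:
    word: str
    arity: int        # 1 = names an entity (unary), 2 = binary relation
    candidates: list  # grounded referents observed when the word was heard

def infer_from_partial_parse(tokens, observed_speaker, observed_addressee):
    """Use the known words to assign grammatical roles, then hypothesize
    meanings for the unknown words from what was observed in the scene."""
    hypotheses = []
    for idx, word in enumerate(tokens):
        if word in LEXICON:
            continue
        # Toy grammatical inference: a word immediately preceding a known
        # verb fills a noun-phrase slot, so it likely names an entity.
        next_known = next((LEXICON[t][0] for t in tokens[idx + 1:]
                           if t in LEXICON), None)
        if next_known == "V":
            # Ground candidate referents in the observed scene: the new
            # word could denote the speaker or the addressee.
            hypotheses.append(Hypothesis(word, 1,
                              [observed_speaker, observed_addressee]))
        else:
            # A word sitting between two noun phrases would instead
            # suggest a binary relation (as with "am"/"are" as identity).
            hypotheses.append(Hypothesis(word, 2, []))
    return hypotheses

# Example: the robot hears "I got the ball" during a game of catch.
if __name__ == "__main__":
    for h in infer_from_partial_parse(["I", "got", "the", "ball"],
                                      observed_speaker="person_A",
                                      observed_addressee="person_B"):
        print(h)
```

In this toy run, "I" is the only unknown token; its position before the known verb yields a unary hypothesis whose candidate referents are the people observed in the scene, mirroring how repeated observations across speakers would let the system settle on "I" meaning the speaker.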