Toward learning perceptually grounded word meanings from unaligned parallel data

  • Authors:
  • Stefanie Tellex, Pratiksha Thaker, Josh Joseph, Matthew R. Walter, Nicholas Roy

  • Affiliations:
  • MIT Computer Science and Artificial Intelligence Laboratory (all authors)

  • Venue:
  • SIAC '12 Proceedings of the Second Workshop on Semantic Interpretation in an Actionable Context
  • Year:
  • 2012


Abstract

In order for robots to effectively understand natural language commands, they must be able to acquire a large vocabulary of meaning representations that can be mapped to perceptual features in the external world. Previous approaches to learning these grounded meaning representations require detailed annotations at training time. In this paper, we present an approach that jointly learns a policy for following natural language commands such as "Pick up the tire pallet," and a mapping between specific phrases in the language and aspects of the external world; for example, the mapping between the words "the tire pallet" and a specific object in the environment. We assume the action policy takes a parametric form that factors according to the structure of the language, following the G3 framework, and use stochastic gradient ascent to optimize the policy parameters. Our preliminary evaluation demonstrates the effectiveness of the model on a corpus of "pick up" commands given to a robotic forklift by untrained users.
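The optimization step described in the abstract can be illustrated with a minimal sketch: a log-linear (softmax) policy over candidate groundings whose parameters are updated by stochastic gradient ascent on expected reward. The feature vectors, the two-candidate setup, and the reward function here are hypothetical toy values for illustration, not the paper's actual model or corpus.

```python
import math
import random

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def score(theta, features):
    """Linear score of a candidate grounding under parameters theta."""
    return sum(t * f for t, f in zip(theta, features))

# Hypothetical features for two candidate objects a phrase could ground to:
# index 0 stands in for the correct object (e.g. the tire pallet).
candidates = [
    [1.0, 0.0],  # assumed features for the correct grounding
    [0.0, 1.0],  # assumed features for a distractor object
]

def sga_step(theta, candidates, reward_fn, lr=0.5):
    """One stochastic gradient ascent step on expected reward
    for a softmax policy: grad = r * (f(a) - E[f])."""
    probs = softmax([score(theta, f) for f in candidates])
    a = random.choices(range(len(candidates)), weights=probs)[0]
    r = reward_fn(a)
    expected = [sum(p * f[k] for p, f in zip(probs, candidates))
                for k in range(len(theta))]
    grad = [r * (candidates[a][k] - expected[k]) for k in range(len(theta))]
    return [t + lr * g for t, g in zip(theta, grad)]

random.seed(0)
theta = [0.0, 0.0]
for _ in range(200):
    # reward 1 only when the policy selects the correct grounding
    theta = sga_step(theta, candidates, lambda a: 1.0 if a == 0 else 0.0)

final_probs = softmax([score(theta, f) for f in candidates])
```

After training, the policy concentrates nearly all probability mass on the correct candidate; the point of the sketch is only that a reward signal, rather than per-phrase annotations, drives the parameter updates.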