Where to go: interpreting natural directions using global inference

  • Authors:
  • Yuan Wei; Emma Brunskill; Thomas Kollar; Nicholas Roy

  • Affiliations:
  • Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA (all authors)

  • Venue:
  • ICRA '09: Proceedings of the 2009 IEEE International Conference on Robotics and Automation
  • Year:
  • 2009

Abstract

An important component of human-robot interaction is that people need to be able to instruct robots to move to other locations using naturally given directions. When giving directions, people often make mistakes such as labelling errors (e.g., left vs. right) and errors of omission (skipping important decision points in a sequence). Furthermore, people often use multiple levels of granularity when specifying directions, referring to a location by a single object landmark, by multiple landmarks, or by identifying a large region as a single location. The challenge is to identify the correct path to a destination from a sequence of noisy, possibly erroneous directions. In our work we cast this problem as probabilistic inference: given a set of directions, an agent should automatically find the path whose geometry and physical appearance maximize the likelihood of those directions. We use a specific variant of a Markov Random Field (MRF) to represent our model, and gather multi-granularity representation information from existing large tagged datasets. On a dataset of route directions collected on the third floor of a large university building, our algorithm correctly inferred the true final destination in 47 of the 55 cases that human volunteers followed successfully. These results suggest that our algorithm performs well relative to human users. In the future, this work will be incorporated into a broader system for autonomously constructing environmental representations that support natural human-robot interaction for direction giving.
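
To make the "global inference" idea concrete, the sketch below scores entire candidate paths through a landmark-tagged map by the joint likelihood of the direction sequence and returns the maximum-likelihood path. This is a simplified, fully factorized stand-in for the paper's MRF variant, not the authors' actual model: the map (GRAPH), the keyword-matching observation model (step_likelihood), and all landmark names are hypothetical, whereas the paper learns its landmark likelihoods from large tagged datasets.

```python
import math

# Hypothetical toy map: nodes are decision points; each edge carries the
# landmark seen when traversing it. Names are illustrative only.
GRAPH = {
    "lobby":   {"hallway": "elevator", "atrium": "staircase"},
    "hallway": {"lab": "whiteboard", "kitchen": "fridge"},
    "atrium":  {"kitchen": "fridge"},
    "lab": {}, "kitchen": {},
}

def step_likelihood(direction, landmark):
    """Stand-in observation model: probability that an uttered phrase
    refers to an edge tagged with `landmark`. A crude keyword match
    with smoothing, so a single labelling error does not zero out an
    otherwise good path."""
    return 0.9 if landmark in direction else 0.1

def best_path(start, directions, graph=GRAPH):
    """Global inference over whole paths: enumerate candidates by
    depth-first search and return the one maximizing the joint
    (log-)likelihood of the direction sequence."""
    best, best_lp = None, -math.inf
    stack = [(start, [start], 0.0, 0)]  # node, path so far, log-prob, step
    while stack:
        node, path, lp, i = stack.pop()
        if i == len(directions):
            if lp > best_lp:
                best, best_lp = path, lp
            continue
        for nxt, landmark in graph[node].items():
            p = step_likelihood(directions[i], landmark)
            stack.append((nxt, path + [nxt], lp + math.log(p), i + 1))
    return best, best_lp

path, logp = best_path("lobby", ["go past the elevator", "turn at the fridge"])
print(path)  # ['lobby', 'hallway', 'kitchen']
```

Because whole paths are scored at once rather than committing greedily at each decision point, a follower of this kind can recover from an isolated error such as a swapped left/right, which is the behavior the abstract attributes to the global-inference formulation.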