Where to go: interpreting natural directions using global inference

  • Authors:
  • Yuan Wei; Emma Brunskill; Thomas Kollar; Nicholas Roy

  • Affiliations:
  • Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA (all authors)

  • Venue:
  • ICRA '09: Proceedings of the 2009 IEEE International Conference on Robotics and Automation
  • Year:
  • 2009

Abstract

An important component of human-robot interaction is that people need to be able to instruct robots to move to other locations using naturally given directions. When giving directions, people often make mistakes such as labelling errors (e.g., left vs. right) and errors of omission (skipping important decision points in a sequence). Furthermore, people often use multiple levels of granularity when specifying directions, referring to a location by a single object landmark, by multiple landmarks, or by identifying a large region as a single location. The challenge is to identify the correct path to a destination from a sequence of noisy, possibly erroneous directions. In our work we cast this problem as probabilistic inference: given a set of directions, an agent should automatically find the path whose geometry and physical appearance maximize the likelihood of those directions. We use a specific variant of a Markov Random Field (MRF) to represent our model, and gather multi-granularity representation information from existing large tagged datasets. On a dataset of route directions collected on the third floor of a large university building, our algorithm correctly inferred the true final destination in 47 of the 55 cases that human volunteers followed successfully. These results suggest that our algorithm performs well relative to human users. In the future, this work will be incorporated into a broader system for autonomously constructing environmental representations that support natural human-robot interaction for direction giving.
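
To make the "global inference" idea concrete, the sketch below scores entire candidate paths through a landmark-tagged map by the joint likelihood of the direction sequence and returns the maximum-likelihood path. This is a simplified, fully factorized stand-in for the paper's MRF variant, not the authors' actual model: the map (GRAPH), the keyword-matching observation model (step_likelihood), and all landmark names are hypothetical, whereas the paper learns its landmark likelihoods from large tagged datasets.

```python
import math

# Hypothetical toy map: nodes are decision points; each edge carries the
# landmark seen when traversing it. Names are illustrative only.
GRAPH = {
    "lobby":   {"hallway": "elevator", "atrium": "staircase"},
    "hallway": {"lab": "whiteboard", "kitchen": "fridge"},
    "atrium":  {"kitchen": "fridge"},
    "lab": {}, "kitchen": {},
}

def step_likelihood(direction, landmark):
    """Stand-in observation model: probability that an uttered phrase
    refers to an edge tagged with `landmark`. A crude keyword match
    with smoothing, so a single labelling error does not zero out an
    otherwise good path."""
    return 0.9 if landmark in direction else 0.1

def best_path(start, directions, graph=GRAPH):
    """Global inference over whole paths: enumerate candidates by
    depth-first search and return the one maximizing the joint
    (log-)likelihood of the direction sequence."""
    best, best_lp = None, -math.inf
    stack = [(start, [start], 0.0, 0)]  # node, path so far, log-prob, step
    while stack:
        node, path, lp, i = stack.pop()
        if i == len(directions):
            if lp > best_lp:
                best, best_lp = path, lp
            continue
        for nxt, landmark in graph[node].items():
            p = step_likelihood(directions[i], landmark)
            stack.append((nxt, path + [nxt], lp + math.log(p), i + 1))
    return best, best_lp

path, logp = best_path("lobby", ["go past the elevator", "turn at the fridge"])
print(path)  # ['lobby', 'hallway', 'kitchen']
```

Because whole paths are scored at once rather than committing greedily at each decision point, a follower of this kind can recover from an isolated error such as a swapped left/right, which is the behavior the abstract attributes to the global-inference formulation.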