The ultimate goal of human-robot interaction is to enable a robot to communicate seamlessly with a human in a natural, human-like fashion. Most work in this field concentrates on speech interpretation and gesture recognition, assuming that a propositional scene representation is already available. Less work has been dedicated to extracting the relevant scene structures that underlie these propositions. As a consequence, most approaches are restricted to place recognition or simple tabletop settings and do not generalize to more complex room setups. In this paper, we propose a hierarchical spatial model that is empirically motivated by psycholinguistic studies. Using this model, the robot is able to extract scene structures from a time-of-flight depth sensor and to adjust its spatial scene representation by taking into account verbal statements about partial aspects of the scene. Without assuming any prior model of the specific room, we show that the system aligns its sensor-based room representation with the semantically meaningful representation typically used by the human describer.