The ultimate goal of human-robot interaction is to enable a robot to communicate seamlessly with a human in a natural, human-like fashion. Most work in this field concentrates on speech interpretation and gesture recognition, assuming that a propositional scene representation is already available. Less work has been dedicated to extracting the relevant scene structures that underlie these propositions. As a consequence, most approaches are restricted to place recognition or simple tabletop settings and do not generalize to more complex room setups. In this paper, we propose a hierarchical spatial model that is empirically motivated by psycholinguistic studies. Using this model, the robot is able to extract scene structures from a time-of-flight depth sensor and to adjust its spatial scene representation by taking into account verbal statements about partial aspects of the scene. Without assuming any prior model of the specific room, we show that the system aligns its sensor-based room representation with the semantically meaningful representation typically used by the human describer.