From bottom-up visual attention to robot action learning

Authors:
Yukie Nagai
Affiliations:
Research Institute for Cognition and Robotics, Bielefeld University, 33594, Germany
Venue:
DEVLRN '09 Proceedings of the 2009 IEEE 8th International Conference on Development and Learning
Year:
2009

Abstract

This research addresses the challenge of developing an action learning model that employs bottom-up visual attention. Although bottom-up attention enables robots to autonomously explore the environment, learn to recognize objects, and interact with humans, the instability of their attention and the poor quality of the information detected at the attentional location have prevented robots from processing dynamic movements. In order to learn actions, robots have to attend stably to the relevant movement by ignoring noise while remaining sensitive to a new important movement. To meet these contradictory requirements, I introduce mechanisms for retinal filtering and stochastic attention selection inspired by human vision. The former reduces the complexity of peripheral vision and thus enables robots to focus more on the currently attended location. The latter allows robots to flexibly shift their attention to a new prominent location, which must be relevant to the demonstrated action. The signals detected at the attentional location are then enriched based on spatial and temporal continuity so that robots can learn to recognize objects, movements, and their associations. Experimental results show that the proposed system can extract key actions from human action demonstrations.
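
The two mechanisms named in the abstract can be illustrated with a minimal sketch. This is only one possible reading, not the paper's implementation: the abstract gives no formulas, so the Gaussian fall-off, the saliency-proportional sampling rule, and the names `retinal_filter`, `select_attention`, and `sigma` are all assumptions made for illustration.

```python
import numpy as np


def retinal_filter(saliency, focus, sigma):
    """Attenuate a saliency map away from the attended location.

    Assumption: a Gaussian weight centered on `focus` (row, col) stands in
    for reduced peripheral detail; the abstract does not give the real form.
    """
    rows, cols = np.indices(saliency.shape)
    dist_sq = (rows - focus[0]) ** 2 + (cols - focus[1]) ** 2
    weight = np.exp(-dist_sq / (2.0 * sigma ** 2))  # 1 at the focus, decays outward
    return saliency * weight


def select_attention(saliency, rng):
    """Sample the next attended location with probability proportional to saliency.

    Assumption: proportional sampling is a stand-in for the stochastic
    selection rule; `saliency` is expected to be non-negative.
    """
    flat = saliency.ravel().astype(float)
    total = flat.sum()
    if total > 0:
        probs = flat / total
    else:
        probs = np.full(flat.size, 1.0 / flat.size)  # no salient point: pick uniformly
    idx = rng.choice(flat.size, p=probs)
    row, col = np.unravel_index(idx, saliency.shape)
    return int(row), int(col)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    saliency = rng.random((48, 64))  # hypothetical saliency map
    focus = (24, 32)                 # currently attended location
    for _ in range(5):
        filtered = retinal_filter(saliency, focus, sigma=10.0)
        focus = select_attention(filtered, rng)
        print(focus)
```

In this sketch the filter keeps attention near the current focus, while the sampling step still allows an occasional shift to a strongly salient peripheral location, which is the trade-off between stability and sensitivity that the abstract describes.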