Cascaded Sequential Attention for Object Recognition with Informative Local Descriptors and Q-learning of Grouping Strategies

Authors:
Lucas Paletta;Gerald Fritz;Christin Seifert
Affiliations:
Institute of Digital Image Processing, JOANNEUM RESEARCH;Institute of Digital Image Processing, JOANNEUM RESEARCH;Institute of Digital Image Processing, JOANNEUM RESEARCH
Venue:
CVPR '05 Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Workshops - Volume 03
Year:
2005

Abstract

This work contributes a three-stage architecture for sequential attention, yielding a system capable of sensorimotor object detection in real-world environments. The first stage selects foci of interest in the image based on the information-theoretic saliency of local image descriptors (i-SIFT). The second stage analyzes the information in the local attention window using a codebook matcher, producing local weak hypotheses about the identity of the object under investigation. The third stage then proposes a shift of attention to the next attention window. The working hypothesis is that integrating the individual local focus-of-attention (FOA) patterns with the geometric relations between them yields better discrimination: the combination provides a more global representation of information and feeds into a recognition state of a Markov Decision Process (MDP). A reinforcement learner (Q-learner) then performs an explorative search over useful actions, i.e., shifts of attention toward locations of salient information, and develops a strategy of action sequences directed in state space toward optimizing discrimination through information maximization. The method is evaluated in experiments on the COIL-20 database (indoor imagery) and the TSG-20 database (outdoor imagery), where it performs efficiently in object detection tasks and is shown to be more accurate and computationally much less expensive than standard SIFT-based recognition.
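The interplay of the second and third stages can be illustrated with a small toy sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: each attention window yields a fixed likelihood vector over object classes (standing in for the codebook matcher's weak hypothesis), weak hypotheses are fused with a naive-Bayes product, the reward for a shift of attention is the resulting drop in posterior entropy, and the state is simply the tuple of windows visited so far (the geometric relation between windows is left out). All function names and numeric values are hypothetical.

```python
import math
import random
from collections import defaultdict


def entropy(posterior):
    """Shannon entropy in bits of a discrete posterior over object classes."""
    return -sum(p * math.log2(p) for p in posterior if p > 0)


def fuse(posterior, likelihood):
    """Fuse the posterior with one local weak hypothesis (assumed naive-Bayes product)."""
    joint = [p * l for p, l in zip(posterior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


def train(window_likelihoods, n_classes, episodes=2000, steps=2,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning over shifts of attention in a toy environment.

    State: tuple of windows visited so far (a simplification of the
    recognition state). Action: index of the next window to attend.
    """
    rng = random.Random(seed)
    Q = defaultdict(float)  # Q[(state, action)], initialised to 0
    windows = list(range(len(window_likelihoods)))
    for _ in range(episodes):
        state = ()
        posterior = [1.0 / n_classes] * n_classes
        for _step in range(steps):
            options = [w for w in windows if w not in state]
            if rng.random() < epsilon:
                action = rng.choice(options)  # explore
            else:
                action = max(options, key=lambda w: Q[(state, w)])  # exploit
            new_posterior = fuse(posterior, window_likelihoods[action])
            # Assumed reward: information gained by attending to this window.
            reward = entropy(posterior) - entropy(new_posterior)
            next_state = state + (action,)
            next_options = [w for w in windows if w not in next_state]
            best_next = max((Q[(next_state, w)] for w in next_options), default=0.0)
            # Standard Q-learning update.
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state, posterior = next_state, new_posterior
    return Q


def greedy_first_shift(Q, n_windows):
    """Window that the learned policy attends to first."""
    return max(range(n_windows), key=lambda w: Q[((), w)])


# Hypothetical likelihoods for three windows and two object classes:
# window 0 is uninformative, window 2 is the most discriminative.
likelihoods = [[0.5, 0.5], [0.6, 0.4], [0.9, 0.1]]
Q = train(likelihoods, n_classes=2)
print(greedy_first_shift(Q, len(likelihoods)))
```

In this toy setting the learned policy should come to prefer the most discriminative window first, which mirrors the stated goal of directing action sequences toward information maximization. A state that also encodes the geometric relation between successive windows, as the abstract describes, would replace the plain visit tuple used here.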