Selective visual attention enables learning and recognition of multiple objects in cluttered scenes

Authors:
Dirk Walther;Ueli Rutishauser;Christof Koch;Pietro Perona
Affiliations:
Comput. and Neural Syst. Prog., 139-74, California Institute of Technology, Pasadena, CA 91125, USA;Comput. and Neural Syst. Prog., 139-74, California Institute of Technology, Pasadena, CA 91125, USA;Comput. and Neural Syst. Prog., 139-74, California Institute of Technology, Pasadena, CA 91125, USA and Div. of Biology, California Institute of Technology, Pasadena, CA 91125, USA;Comput. and Neural Syst. Prog., 139-74, California Institute of Technology, Pasadena, CA 91125, USA and Dept. of Electr. Engin., 136-93, California Institute of Technology, Pasadena, CA 91125, USA
Venue:
Computer Vision and Image Understanding - Special issue: Attention and performance in computer vision
Year:
2005

Abstract

A key problem in learning representations of multiple objects from unlabeled images is that it is a priori impossible to tell which part of the image corresponds to each individual object, and which part is irrelevant clutter. Distinguishing individual objects in a scene would allow unsupervised learning of multiple objects from unlabeled images. There is psychophysical and neurophysiological evidence that the brain employs visual attention to select relevant parts of the image and to serialize the perception of individual objects. We propose a method for the selection of salient regions likely to contain objects, based on bottom-up visual attention. By comparing the performance of David Lowe's recognition algorithm with and without attention, we demonstrate in our experiments that the proposed approach can enable one-shot learning of multiple objects from complex scenes, and that it can strongly improve learning and recognition performance in the presence of large amounts of clutter.
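
The abstract does not spell out how salient regions are computed, so the following is only a minimal sketch of the general idea of bottom-up region selection, not the authors' model. It assumes a single grayscale intensity channel and approximates center-surround contrast with two box filters of different sizes; the function names (`box_blur`, `saliency_map`, `attended_region`) and all parameter values are hypothetical choices made for illustration. The attended region is taken to be the connected patch around the saliency peak, and in a full pipeline only features falling inside that patch would be passed on to the recognition stage.

```python
import numpy as np


def box_blur(img, radius):
    """Mean filter over a (2*radius+1)-wide square window, edge-padded."""
    img = img.astype(float)
    if radius == 0:
        return img
    k = 2 * radius + 1
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    total = ii[k:k + h, k:k + w] - ii[:h, k:k + w] - ii[k:k + h, :w] + ii[:h, :w]
    return total / (k * k)


def saliency_map(gray, center_radius=1, surround_radius=4):
    """Assumed stand-in for bottom-up saliency: |fine blur - coarse blur|."""
    return np.abs(box_blur(gray, center_radius) - box_blur(gray, surround_radius))


def attended_region(gray, frac=0.5):
    """Return a boolean mask and bounding box around the saliency peak.

    The mask is the 4-connected set of pixels, grown from the peak, whose
    saliency is at least `frac` times the peak value (an assumed rule).
    """
    sal = saliency_map(gray)
    peak = np.unravel_index(np.argmax(sal), sal.shape)
    candidates = sal >= frac * sal[peak]
    region = np.zeros(sal.shape, dtype=bool)
    stack = [(int(peak[0]), int(peak[1]))]
    while stack:  # iterative flood fill from the peak
        r, c = stack.pop()
        if not (0 <= r < sal.shape[0] and 0 <= c < sal.shape[1]):
            continue
        if region[r, c] or not candidates[r, c]:
            continue
        region[r, c] = True
        stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    rows, cols = np.nonzero(region)
    bbox = (int(rows.min()), int(rows.max()), int(cols.min()), int(cols.max()))
    return region, bbox


# Usage: a dark scene containing one small bright object.
scene = np.zeros((32, 32))
scene[10:14, 20:24] = 1.0
mask, bbox = attended_region(scene)
print("attended bounding box (row_min, row_max, col_min, col_max):", bbox)
```

Restricting learning and recognition to the returned mask is what would let a keypoint-based recognizer ignore clutter elsewhere in the image. Attending to several objects in turn would additionally require suppressing each selected region before searching for the next peak, which this sketch leaves out.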