Peripheral-foveal vision for real-time object recognition and tracking in video

Authors:
Stephen Gould;Joakim Arfvidsson;Adrian Kaehler;Benjamin Sapp;Marius Messner;Gary Bradski;Paul Baumstarck;Sukwon Chung;Andrew Y. Ng
Affiliations:
Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA
Venue:
IJCAI'07 Proceedings of the 20th International Joint Conference on Artificial Intelligence
Year:
2007

Abstract

Human object recognition in a physical 3-D environment is still far superior to that of any robotic vision system. We believe that one reason (out of many) for this, one that has not heretofore been significantly exploited in the artificial vision literature, is that humans use a fovea to fixate on, or near, an object, thus obtaining a very high resolution image of the object and rendering it easy to recognize. In this paper, we present a novel method for identifying and tracking objects in multiresolution digital video of partially cluttered environments. Our method is motivated by biological vision systems and uses a learned "attentive" interest map on a low resolution data stream to direct a high resolution "fovea." Objects that are recognized in the fovea can then be tracked using peripheral vision. Because object recognition is run only on a small foveal image, our system achieves real-time object recognition and tracking performance well beyond that of simpler systems.
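The idea of letting a low resolution interest map direct a high resolution fovea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes a learned interest map, whereas the scores here are simply given as numbers, and the names `select_fovea`, `crop`, `scale`, and `fovea_size` are hypothetical. The sketch only shows the coordinate mapping, assuming the high resolution frame is `scale` times larger than the interest map in each dimension.

```python
def select_fovea(interest_map, scale, fovea_size, hi_width, hi_height):
    """Return (left, top, right, bottom) of a foveal window in high-res pixels.

    interest_map is a low-resolution grid of scores (assumed given here;
    in the paper it is learned).
    """
    # Find the peripheral (low-resolution) cell with the highest interest score.
    best_row, best_col = max(
        ((r, c) for r, row in enumerate(interest_map) for c in range(len(row))),
        key=lambda rc: interest_map[rc[0]][rc[1]],
    )
    # Map the centre of that cell into high-resolution pixel coordinates.
    cx = int((best_col + 0.5) * scale)
    cy = int((best_row + 0.5) * scale)
    # Centre the fovea there, then clamp so the window stays inside the frame.
    left = min(max(cx - fovea_size // 2, 0), hi_width - fovea_size)
    top = min(max(cy - fovea_size // 2, 0), hi_height - fovea_size)
    return left, top, left + fovea_size, top + fovea_size


def crop(frame, box):
    """Cut the foveal window out of a frame stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


# Toy data: a 3x4 interest map over a 12x16 high-resolution frame (scale 4).
interest_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.0, 0.3, 0.9, 0.1],
    [0.1, 0.1, 0.2, 0.0],
]
frame = [[y * 16 + x for x in range(16)] for y in range(12)]

box = select_fovea(interest_map, scale=4, fovea_size=4, hi_width=16, hi_height=12)
patch = crop(frame, box)  # only this small patch would go to the recognizer
```

In this toy run the recognizer would see a 4x4 patch instead of the full 12x16 frame, which is the source of the speed-up the abstract describes: the expensive recognition step runs on a small foveal image while the cheap interest map covers the whole scene.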