A Model of Saliency-Based Visual Attention for Rapid Scene Analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence
Attentional Selection for Object Recognition A Gentle Way
BMCV '02 Proceedings of the Second International Workshop on Biologically Motivated Computer Vision
Object Recognition from Local Scale-Invariant Features
ICCV '99 Proceedings of the International Conference on Computer Vision, Volume 2
Distinctive Image Features from Scale-Invariant Keypoints
International Journal of Computer Vision
Context-Based Detection of Keypoints and Features in Eye Regions
ICPR '96 Proceedings of the 13th International Conference on Pattern Recognition - Volume 2
Is bottom-up attention useful for object recognition?
CVPR '04 Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Dynamic visual attention model in image sequences
Image and Vision Computing
A robust approach to segment desired object based on salient colors
Journal on Image and Video Processing - Color in Image and Video Processing
Object recognition and segmentation in videos by connecting heterogeneous visual features
Computer Vision and Image Understanding
Bottom-up visual attention allows primates to quickly select image regions that contain salient objects. In artificial systems, restricting object recognition to these regions enables faster recognition and unsupervised learning of multiple objects in cluttered scenes. A problem with this approach is that objects superficially dissimilar to the target receive the same consideration in recognition as similar objects. Likewise, in video, objects recognized in previous frames at locations distant from the current fixation point receive the same consideration as objects previously recognized closer to the current target of attention, even though, due to the continuity of smooth motion, objects recently recognized near the current focus of attention have a high probability of matching the current target. Here we investigate rapid pruning of the facial recognition search space using the already-computed low-level features that guide attention, together with spatial information derived from previous video frames. For each video frame, Itti & Koch's bottom-up visual attention algorithm selects salient locations based on low-level features such as contrast, orientation, color, intensity, flicker, and motion; this algorithm has been shown to be highly effective at selecting faces as salient objects. Lowe's SIFT object recognition algorithm then extracts a signature of the attended object for comparison with the facial database. The database search is prioritized toward faces that better match the low-level features which guided attention to the current candidate, or that were previously recognized near the current candidate's location. The SIFT signatures of the prioritized faces are then checked against the attended candidate for a match.
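The prioritization step described above can be sketched as a simple ranking function. This is a minimal, hypothetical illustration, not the paper's implementation: the scoring weights, the Gaussian spatial bonus, and all names (`prioritize`, `recent_sightings`, `sigma`) are assumptions chosen to show how feature similarity and spatial proximity from previous frames could jointly order the database search.

```python
import math

def prioritize(candidates, attended_features, fixation_xy, recent_sightings,
               w_feature=1.0, w_spatial=1.0, sigma=50.0):
    """Return database face IDs ordered most-promising first.

    candidates: dict face_id -> low-level feature vector (list of floats)
    attended_features: feature vector at the current attended location
    fixation_xy: (x, y) of the current focus of attention
    recent_sightings: dict face_id -> (x, y) where it was last recognized
    """
    def feature_similarity(f):
        # Negative Euclidean distance: closer feature vectors score higher.
        return -math.dist(f, attended_features)

    def spatial_bonus(face_id):
        # Faces recently recognized near the fixation point get a
        # Gaussian proximity bonus (illustrative choice of kernel).
        if face_id not in recent_sightings:
            return 0.0
        d = math.dist(recent_sightings[face_id], fixation_xy)
        return math.exp(-(d * d) / (2 * sigma * sigma))

    scored = [(w_feature * feature_similarity(f) + w_spatial * spatial_bonus(fid), fid)
              for fid, f in candidates.items()]
    scored.sort(reverse=True)
    return [fid for _, fid in scored]
```

Faces at the front of the returned list would have their SIFT signatures checked first, so a match near the top ends the search early, which is the source of the claimed speedup.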
By comparing the performance of Lowe's recognition algorithm combined with Itti & Koch's bottom-up attention model, with and without search-space pruning, we demonstrate that our pruning approach improves the speed of facial recognition in video footage.