Mimicking visual searching with integrated top down cues and low-level features

Authors:
Jiawei Xu;Shigang Yue
Affiliations:
-;-
Venue:
Neurocomputing
Year:
2014

Abstract

Visual searching is a perceptual task that involves visual attention, attention shifts, and an active scan of the visual environment for a particular object or feature. The key idea of our paper is to mimic human visual searching in static and dynamic scenes. An artificial vision system that performs visual searching could be helpful in applications ranging from medical and psychological tools to human-machine interaction. Recent state-of-the-art research focuses on bottom-up and top-down saliency maps. A saliency map indicates the saliency likelihood of each pixel; understanding the visual searching process goes deeper than the saliency map, because it can help an artificial vision system examine details in a way similar to humans, which will benefit future robots and machine vision systems. This paper proposes a computational model that tries to mimic the human visual searching process, with emphasis on motion cues in visual processing and searching. Our model analyzes attention shifts by fusing top-down bias and bottom-up cues, and it also takes the motion factor into account in the visual searching process. The proposed model involves five modules: the pre-learning process, top-down biasing, the bottom-up mechanism, a multi-layer neural network, and attention shifts. Experimental evaluation on benchmark databases and real-time video showed that the model demonstrates high robustness and real-time ability in complex dynamic scenes.
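The abstract does not say how the cues are fused or how attention shifts are selected, so the following is only an illustrative sketch under stated assumptions. It assumes a weighted per-pixel sum of three maps (bottom-up, top-down, motion) and a winner-take-all selection with inhibition of return. The function names `fuse_maps` and `attention_shifts`, the weights, and the suppression radius are all hypothetical and are not taken from the paper.

```python
def fuse_maps(bottom_up, top_down, motion, weights=(1.0, 1.0, 1.0)):
    """Weighted per-pixel sum of three equally sized 2-D maps.

    Assumption: a linear fusion rule; the paper's actual rule is not given
    in the abstract.
    """
    w_bu, w_td, w_mo = weights
    return [
        [w_bu * b + w_td * t + w_mo * m for b, t, m in zip(row_b, row_t, row_m)]
        for row_b, row_t, row_m in zip(bottom_up, top_down, motion)
    ]


def attention_shifts(fused, n_shifts=3, radius=1):
    """Return successive attention targets as (row, col) pairs.

    Assumption: winner-take-all selection followed by inhibition of return.
    """
    work = [row[:] for row in fused]  # copy so the caller's map is untouched
    height, width = len(work), len(work[0])
    fixations = []
    for _ in range(n_shifts):
        # Winner-take-all: the most salient remaining location wins.
        r, c = max(
            ((i, j) for i in range(height) for j in range(width)),
            key=lambda rc: work[rc[0]][rc[1]],
        )
        fixations.append((r, c))
        # Inhibition of return: suppress the neighbourhood of the winner
        # so the next shift moves to a different region.
        for i in range(max(0, r - radius), min(height, r + radius + 1)):
            for j in range(max(0, c - radius), min(width, c + radius + 1)):
                work[i][j] = float("-inf")
    return fixations


# Toy 4x4 maps: a strong bottom-up peak at (1, 1) and a top-down bias
# toward the lower-right corner. No motion in this static example.
bottom_up = [[0.1, 0.2, 0.1, 0.0],
             [0.2, 0.9, 0.2, 0.0],
             [0.1, 0.2, 0.1, 0.3],
             [0.0, 0.0, 0.3, 0.4]]
top_down = [[0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.5],
            [0.0, 0.0, 0.5, 1.0]]
motion = [[0.0] * 4 for _ in range(4)]

fused = fuse_maps(bottom_up, top_down, motion)
fixations = attention_shifts(fused, n_shifts=2, radius=1)
print(fixations)
```

In this toy example the top-down bias pulls the first fixation to the lower-right corner even though the strongest bottom-up response is at (1, 1); the second shift then lands on the bottom-up peak. For a dynamic scene, a non-zero `motion` map would contribute in the same way.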