A Model of Saliency-Based Visual Attention for Rapid Scene Analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence
A user attention model for video summarization
Proceedings of the tenth ACM international conference on Multimedia
Computation with spikes in a winner-take-all network
Neural Computation
Saliency detection for content-aware image resizing
ICIP'09 Proceedings of the 16th IEEE international conference on Image processing
Top-down visual saliency via joint CRF and dictionary learning
CVPR '12 Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Hi-index | 0.01 |
Visual searching is a perception task involved with visual attention, attention shift and active scan of the visual environment for a particular object or feature. The key idea of our paper is to mimic the human visual searching under the static and dynamic scenes. To build up an artificial vision system that performs the visual searching could be helpful to medical and psychological application development to human machine interaction. Recent state-of-the-art researches focus on the bottom-up and top-down saliency maps. Saliency maps indicate that the saliency likelihood of each pixel, however, understanding the visual searching process can help an artificial vision system exam details in a way similar to human and they will be good for future robots or machine vision systems which is a deeper digest than the saliency map. This paper proposed a computational model trying to mimic human visual searching process and we emphasis the motion cues on the visual processing and searching. Our model analysis the attention shifts by fusing the top-down bias and bottom-up cues. This model also takes account the motion factor into the visual searching processing. The proposed model involves five modules: the pre-learning process; top-down biasing; bottom-up mechanism; multi-layer neural network and attention shifts. Experiment evaluation results via benchmark databases and real-time video showed the model demonstrated high robustness and real-time ability under complex dynamic scenes.