Relevance of a feed-forward model of visual attention for goal-oriented and free-viewing tasks

Authors:
Olivier Le Meur; Jean-Claude Chevet
Affiliations:
ESIR, University of Rennes 1, IRISA, TEMICS, Rennes, France; Technicolor R&D France, Cesson-Sévigné, France
Venue:
IEEE Transactions on Image Processing
Year:
2010

Abstract

A purely bottom-up model of visual attention is proposed and compared to five state-of-the-art models. The role of low-level visual features is examined in two contexts, using two datasets: one containing eye-tracking data recorded during a free-viewing task, and a second containing 5000 hand-labeled pictures (observers had to enclose the most visually interesting objects in a rectangle). The relevance of the bottom-up models, i.e., the ability of a model to predict where the salient areas are located, is evaluated. Regardless of the metric and the dataset, the degree of similarity between predictions and ground truth is significantly above chance. The proposed model, which rests on a small number of features, is shown to be a good predictor of human visual fixations and also of the objects chosen as interesting by observers. This study suggests that low-level visual features play a significant role not only in a free-viewing task but also in a high-level visual task, such as choosing the object of interest in a complex visual scene. Another outcome concerns the viewing duration used in eye-tracking experiments: results suggest that this parameter is ultimately not as critical as one would expect.
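The abstract does not name the similarity metrics it uses, so the following is only an illustrative sketch of what "above chance" can mean when a saliency map is scored against a hand-drawn rectangle. It assumes the area under the ROC curve (AUC) as the metric, where 0.5 corresponds to chance; the function names `rect_mask` and `auc_score`, the toy image size, and the Gaussian "prediction" are all hypothetical and are not taken from the paper.

```python
import numpy as np


def rect_mask(shape, top, left, bottom, right):
    """Boolean ground-truth mask for a rectangle (bottom/right exclusive)."""
    mask = np.zeros(shape, dtype=bool)
    mask[top:bottom, left:right] = True
    return mask


def auc_score(saliency, mask):
    """ROC AUC of a saliency map against a binary mask.

    Uses the rank-sum (Mann-Whitney U) formulation with average ranks
    for tied saliency values. 0.5 is chance level, 1.0 is a perfect ranking.
    """
    s = np.asarray(saliency, dtype=float).ravel()
    m = np.asarray(mask, dtype=bool).ravel()
    n_pos = int(m.sum())
    n_neg = m.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("mask must contain both salient and non-salient pixels")

    # Average rank of each distinct saliency value (ranks start at 1).
    _, inverse, counts = np.unique(s, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)
    avg_rank = last_rank - (counts - 1) / 2.0
    ranks = avg_rank[inverse]

    u_stat = ranks[m].sum() - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Toy example (assumed data): a 20x30 image with one labeled rectangle and a
# Gaussian blob standing in for a model's saliency prediction.
shape = (20, 30)
truth = rect_mask(shape, top=6, left=10, bottom=15, right=21)
rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
prediction = np.exp(-((rows - 10) ** 2 + (cols - 15) ** 2) / (2 * 4.0 ** 2))

score = auc_score(prediction, truth)
chance = auc_score(np.ones(shape), truth)  # uninformative map
print(f"AUC of toy prediction: {score:.3f} (chance: {chance:.1f})")
```

The same scoring function can be applied to a binary fixation map instead of a rectangle mask, which is one way a single metric could be shared across the free-viewing and hand-labeled datasets; whether the paper proceeds this way is not stated in the abstract.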