Robust Real-Time Face Detection. International Journal of Computer Vision.
Gaze-based interaction for semi-automatic photo cropping. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.
Towards efficient context-specific video coding based on gaze-tracking analysis. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP).
Some Objects Are More Equal Than Others: Measuring and Predicting Importance. Proceedings of the 10th European Conference on Computer Vision (ECCV '08), Part I.
What do you see when you're surfing? Using eye tracking to predict salient regions of web pages. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.
Proceedings of the 17th ACM International Conference on Multimedia (MM '09).
In the Eye of the Beholder: A Survey of Models for Eyes and Gaze. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Object Detection with Discriminatively Trained Part-Based Models. IEEE Transactions on Pattern Analysis and Machine Intelligence.
An eye fixation database for saliency detection in images. Proceedings of the 11th European Conference on Computer Vision (ECCV '10), Part IV.
Eye-tracking methodology and applications to images and video. Proceedings of the 19th ACM International Conference on Multimedia (MM '11).
Human visual attention (HVA) is an important strategy for focusing on specific information while observing and understanding visual stimuli. HVA involves making a series of fixations on selected locations while performing tasks such as object recognition and scene understanding. We present one of the first works to combine fixation information with automated concept detectors to (i) infer abstract image semantics and (ii) enhance the performance of object detectors. We develop visual attention-based models that sample fixation distributions and fixation transition distributions in regions of interest (ROIs) to infer abstract semantics such as expressive faces and interactions (e.g., look, read). We also exploit eye-gaze information to deduce the likely location and scale of salient concepts, thereby aiding state-of-the-art detectors. We achieve an 18% performance increase together with an over 80% reduction in computation time for a state-of-the-art object detector [4].
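The idea of using gaze to deduce the likely location and scale of a salient concept can be illustrated with a minimal sketch: given recorded (x, y) fixation points on an image, take the fixation centroid as the region's centre and the fixation spread as a proxy for its scale, yielding a candidate window in which a detector can be run instead of scanning the whole image. The function name `fixation_roi` and the `scale` parameter below are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

def fixation_roi(fixations, scale=2.0):
    """Estimate a candidate region of interest from gaze data.

    fixations: iterable of (x, y) fixation coordinates in pixels.
    scale: hypothetical tuning factor mapping fixation spread to
           window size (not from the paper).
    Returns (x, y, width, height) of the proposed window.
    """
    pts = np.asarray(fixations, dtype=float)
    cx, cy = pts.mean(axis=0)                # fixation centroid -> centre
    sx, sy = pts.std(axis=0) + 1e-6          # spread -> scale proxy
    w, h = scale * 2.0 * sx, scale * 2.0 * sy
    return (cx - w / 2.0, cy - h / 2.0, w, h)

# Fixations clustered around (100, 80) propose a window there,
# so an object detector need only search this sub-image.
roi = fixation_roi([(95, 75), (105, 85), (100, 80), (98, 82)])
```

Restricting the detector's sliding-window search to such gaze-derived windows is one plausible way the reported reduction in computation time could arise: the detector evaluates a small sub-image rather than all positions and scales.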