Finding and Labeling the Subject of a Captioned Depictive Natural Photograph
IEEE Transactions on Knowledge and Data Engineering
We propose a novel framework that localizes and labels affective objects and actions in images by combining textual, visual, and gaze-based analysis. Human gaze provides useful cues for inferring the locations and interactions of affective objects. While the concepts (labels) associated with an image can be determined from its caption, we show that these concepts can be localized by learning a statistical affect model for world concepts. The affect model is derived from non-invasively acquired fixation patterns on labeled images, and it guides the localization of affective objects (e.g., faces, reptiles) and actions (e.g., look, read) from fixations on unlabeled images. Experimental results on a database of 500 images confirm the effectiveness and promise of the proposed approach.
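The abstract does not specify the form of the statistical affect model, so the following Python sketch is purely illustrative of one plausible shape of such a pipeline: per-concept Gaussian fixation statistics are fit from gaze data on labeled images, and each caption-derived concept in an unlabeled image is then assigned to the fixation cluster whose statistics it best matches. All function names, the feature choices, and the Gaussian model are assumptions for illustration, not the paper's actual method.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class ConceptModel:
    """Per-concept fixation statistics learned from labeled images.
    (Hypothetical stand-in for the paper's statistical affect model.)"""
    mean: np.ndarray  # mean fixation feature vector for this concept
    cov: np.ndarray   # covariance of fixation features

def fit_concept_models(fixation_features, labels):
    """Fit one Gaussian per concept from fixations on labeled images.

    fixation_features: (N, D) array of per-fixation features
        (e.g., fixation duration, saccade length, local saliency).
    labels: length-N list giving the concept label of each fixation.
    """
    models = {}
    for concept in set(labels):
        feats = fixation_features[np.array([l == concept for l in labels])]
        models[concept] = ConceptModel(
            mean=feats.mean(axis=0),
            # Small diagonal term keeps the covariance invertible.
            cov=np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1]),
        )
    return models

def avg_log_likelihood(model, feats):
    """Average Gaussian log-likelihood of fixation features under a model."""
    d = feats - model.mean
    inv = np.linalg.inv(model.cov)
    _, logdet = np.linalg.slogdet(model.cov)
    quad = np.einsum('nd,dk,nk->n', d, inv, d)  # per-row quadratic form
    const = feats.shape[1] * np.log(2 * np.pi)
    return float(np.mean(-0.5 * (quad + logdet + const)))

def localize(caption_concepts, fixation_clusters, models):
    """Assign each caption-derived concept to its best fixation cluster.

    fixation_clusters: list of (region_bbox, (M_i, D) feature array) pairs,
        e.g., obtained by clustering raw gaze points on the unlabeled image.
    Returns a dict {concept: region_bbox}.
    """
    assignment = {}
    for concept in caption_concepts:
        if concept not in models:
            continue
        scores = [avg_log_likelihood(models[concept], feats)
                  for _, feats in fixation_clusters]
        assignment[concept] = fixation_clusters[int(np.argmax(scores))][0]
    return assignment
```

In a complete system the fixation clusters might come from, say, mean-shift clustering of raw gaze points, and the concept-to-region assignment could be solved jointly across all caption concepts rather than greedily per concept as above.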