Unsupervised disambiguation of image captions

  • Authors:
  • Wesley May, Sanja Fidler, Afsaneh Fazly, Sven Dickinson, Suzanne Stevenson

  • Affiliations:
  • University of Toronto, Toronto, Ontario, Canada (all authors)

  • Venue:
  • SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
  • Year:
  • 2012

Abstract

Given a set of images with related captions, our goal is to show how visual features can improve the accuracy of unsupervised word sense disambiguation when the textual context is very small, as this sort of data is common in news and social media. We extend previous work in unsupervised text-only disambiguation with methods that integrate text and images. We construct a corpus by using Amazon Mechanical Turk to caption sense-tagged images gathered from ImageNet. Using a Yarowsky-inspired algorithm, we show that gains can be made over text-only disambiguation, as well as over multimodal approaches such as Latent Dirichlet Allocation.
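
The Yarowsky algorithm referenced in the abstract is a semi-supervised bootstrapping procedure: start from a handful of seed-labeled examples per sense, train a simple classifier, confidently label more instances, and repeat until no labels change. The sketch below is a minimal Python illustration of that loop over bags of features that mix caption words with visual codewords, in the spirit of the paper's text-plus-image integration; the function name, the feature names such as vis_water, the smoothing constants, and the Naive Bayes scorer are illustrative assumptions, not the authors' implementation.

    from collections import Counter
    import math

    def yarowsky_bootstrap(instances, seed_labels, senses,
                           threshold=0.7, max_iters=10):
        """Yarowsky-style bootstrapping sketch (hypothetical, not the paper's code).

        instances: list of feature bags, each mixing caption words with
                   visual codewords for the accompanying image
        seed_labels: {instance index: sense} for the initial seed examples
        """
        labels = dict(seed_labels)
        for _ in range(max_iters):
            # Re-estimate feature-sense counts from currently labeled instances.
            counts = {s: Counter() for s in senses}
            totals = Counter()
            for i, sense in labels.items():
                counts[sense].update(instances[i])
                totals[sense] += 1
            changed = False
            for i, feats in enumerate(instances):
                if i in seed_labels:
                    continue  # never relabel the seeds
                # Smoothed Naive Bayes log-score for each sense.
                scores = {}
                for s in senses:
                    denom = sum(counts[s].values()) + 1
                    scores[s] = math.log(totals[s] + 1) + sum(
                        math.log((counts[s][f] + 0.1) / denom) for f in feats
                    )
                best = max(scores, key=scores.get)
                # Softmax confidence of the top-scoring sense.
                conf = 1.0 / sum(math.exp(v - scores[best])
                                 for v in scores.values())
                if conf >= threshold and labels.get(i) != best:
                    labels[i] = best
                    changed = True
            if not changed:
                break  # converged: no instance changed its label
        return labels

    # Toy usage: captions for the ambiguous word "bank", with made-up
    # visual codewords standing in for image features.
    instances = [
        ["bank", "river", "vis_water"],
        ["bank", "money", "vis_building"],
        ["bank", "vis_water"],
        ["bank", "teller", "vis_building"],
    ]
    seeds = {0: "river_bank", 1: "financial_bank"}
    print(yarowsky_bootstrap(instances, seeds,
                             ["river_bank", "financial_bank"]))

In the paper's setting the visual features would presumably be derived from the ImageNet images themselves rather than hand-written codeword strings; the point of the sketch is only the bootstrapping loop, where a short caption alone may be too ambiguous but the combined text and visual evidence pushes an instance over the confidence threshold.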