Every picture tells a story: generating sentences from images

  • Authors:
  • Ali Farhadi; Mohsen Hejrati; Mohammad Amin Sadeghi; Peter Young; Cyrus Rashtchian; Julia Hockenmaier; David Forsyth

  • Affiliations:
  • Computer Science Department, University of Illinois at Urbana-Champaign (Ali Farhadi, Peter Young, Cyrus Rashtchian, Julia Hockenmaier, David Forsyth); Computer Vision Group, School of Mathematics, Institute for Studies in Theoretical Physics and Mathematics (Mohsen Hejrati, Mohammad Amin Sadeghi)

  • Venue:
  • ECCV '10: Proceedings of the 11th European Conference on Computer Vision, Part IV
  • Year:
  • 2010

Abstract

Humans can prepare concise descriptions of pictures, focusing on what they find important. We demonstrate that automatic methods can do so too. We describe a system that can compute a score linking an image to a sentence. This score can be used to attach a descriptive sentence to a given image, or to obtain images that illustrate a given sentence. The score is obtained by comparing an estimate of meaning obtained from the image to one obtained from the sentence. Each estimate of meaning comes from a discriminative procedure that is learned using data. We evaluate on a novel dataset consisting of human-annotated images. While our underlying estimate of meaning is impoverished, it is sufficient to produce very good quantitative results, evaluated with a novel score that can account for synecdoche.
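
To make the linking idea concrete, here is a minimal Python sketch of the scoring structure the abstract describes. It is not the authors' implementation: the paper's shared meaning space consists of <object, action, scene> triplets, but the `image_scorer` and `sentence_scorer` callables and the cosine comparison below are placeholder assumptions.

```python
# Hypothetical sketch: score an (image, sentence) pair by comparing each
# side's estimate of meaning over a shared space of candidate triplets.
# The <object, action, scene> representation follows the paper; the scorer
# callables and the cosine comparison are illustrative assumptions.
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Meaning:
    """One point in the meaning space: an <object, action, scene> triplet."""
    obj: str
    action: str
    scene: str


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two confidence vectors (an assumed choice)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def link_score(
    image: object,
    sentence: str,
    meanings: Sequence[Meaning],
    image_scorer: Callable[[object, Meaning], float],
    sentence_scorer: Callable[[str, Meaning], float],
) -> float:
    """Compare the image's and the sentence's estimates of meaning,
    each expressed as confidences over the same pool of triplets."""
    image_estimate = [image_scorer(image, m) for m in meanings]
    sentence_estimate = [sentence_scorer(sentence, m) for m in meanings]
    return cosine(image_estimate, sentence_estimate)
```

Given such a score, both tasks in the abstract become retrieval problems: to annotate an image, keep the candidate sentence with the highest `link_score`; to illustrate a sentence, return the images that score highest against it.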