Cross-caption coreference resolution for automatic image understanding

  • Authors: Micah Hodosh; Peter Young; Cyrus Rashtchian; Julia Hockenmaier

  • Affiliations: University of Illinois at Urbana-Champaign (all four authors)

  • Venue: CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
  • Year: 2010

Abstract

Recent work in computer vision has aimed to associate image regions with keywords describing the depicted entities, but actual image 'understanding' would also require identifying their attributes, relations and activities. Since this information cannot be conveyed by simple keywords, we have collected a corpus of "action" photos, each associated with five descriptive captions. To obtain a consistent semantic representation for each image, we first need to identify which noun phrases (NPs) refer to the same entities. We present three hierarchical Bayesian models for cross-caption coreference resolution. We have also created a simple ontology of entity classes that appear in images, and we evaluate how well these classes can be recovered.