Composing simple image descriptions using web-scale n-grams

  • Authors:
  • Siming Li, Girish Kulkarni, Tamara L. Berg, Alexander C. Berg, Yejin Choi

  • Affiliation:
  • Stony Brook University, NY (all authors)

  • Venue:
  • CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning
  • Year:
  • 2011


Abstract

Studying natural language, and especially how people describe the world around them, can help us better understand the visual world. In turn, it can also help us in the quest to generate natural language that describes this world in a human manner. We present a simple yet effective approach to automatically composing image descriptions from computer vision-based inputs using web-scale n-grams. Unlike most previous work, which summarizes or retrieves pre-existing text relevant to an image, our method composes sentences entirely from scratch. Experimental results indicate that it is viable to generate simple textual descriptions that are pertinent to the specific content of an image while permitting creativity in the description, making for more human-like annotations than previous approaches.
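
The sketch below is only a minimal illustration of the general idea of scoring candidate word orderings with n-gram statistics; it is not the authors' actual pipeline. The detection pairs, the fixed "on the" connective, and the toy bigram counts are all assumptions made for the example, standing in for real visual detector outputs and web-scale counts.

```python
from itertools import permutations
from math import log

# Toy bigram counts standing in for web-scale n-gram statistics.
# Real counts would come from a far larger corpus; these are illustrative only.
BIGRAM_COUNTS = {
    ("brown", "dog"): 120000,
    ("dog", "on"): 95000,
    ("on", "the"): 5000000,
    ("the", "green"): 300000,
    ("green", "grass"): 150000,
    ("grass", "on"): 8000,
}
UNSEEN = 1  # crude floor for unseen bigrams (no smoothing)

def bigram_score(words):
    """Sum of log bigram counts; higher means more fluent under the toy model."""
    return sum(log(BIGRAM_COUNTS.get(pair, UNSEEN)) for pair in zip(words, words[1:]))

def compose(detections):
    """Pick the most fluent ordering of simple <attribute noun> phrases.

    `detections` is a hypothetical list of (attribute, noun) pairs standing in
    for visual detector outputs; the connective "on the" is fixed purely for
    illustration.
    """
    phrases = [[attr, noun] for attr, noun in detections]
    best = None
    for order in permutations(phrases):
        words = []
        for i, phrase in enumerate(order):
            if i > 0:
                words += ["on", "the"]
            words += phrase
        score = bigram_score(words)
        if best is None or score > best[0]:
            best = (score, words)
    return " ".join(best[1])

if __name__ == "__main__":
    print(compose([("brown", "dog"), ("green", "grass")]))
    # -> "brown dog on the green grass"
```

Under these assumed counts, the ordering "brown dog on the green grass" scores higher than the reversed ordering because its bigrams are all attested, which is the basic intuition behind using n-gram frequencies to choose among candidate compositions.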