Efficient image annotation for automatic sentence generation

  • Authors: Yoshitaka Ushiku; Tatsuya Harada; Yasuo Kuniyoshi
  • Affiliations: The University of Tokyo, Tokyo, Japan; The University of Tokyo & JST PRESTO, Tokyo, Japan; The University of Tokyo, Tokyo, Japan
  • Venue: Proceedings of the 20th ACM International Conference on Multimedia
  • Year: 2012

Abstract

Sentence generation from images is an ultimate goal of image recognition. In this paper, we address this goal by tackling a novel problem, the "multi-keyphrase problem". We hypothesize that the contents of an image can be described with multi-keyphrases, and that a natural sentence can be generated by connecting these multi-keyphrases with an experimental grammar model. Existing methods require semantic knowledge such as labels for objects, actions, or scenes, so using them demands the preparation of a highly organized dataset. We therefore propose a novel online learning method for multi-keyphrase estimation. The proposed framework is simple and scalable, and it can generate sentences from images without such semantic knowledge. Moreover, the proposed method for multi-keyphrase estimation is also applicable to image annotation, where it achieves state-of-the-art performance. Our experiment, which uses only images and texts, demonstrates that the proposed framework is useful for sentence generation from images.