Learning Visual Compound Models from Parallel Image-Text Datasets

  • Authors:
  • Jan Moringen; Sven Wachsmuth; Sven Dickinson; Suzanne Stevenson

  • Affiliations:
  • Bielefeld University; Bielefeld University; University of Toronto; University of Toronto

  • Venue:
  • Proceedings of the 30th DAGM Symposium on Pattern Recognition
  • Year:
  • 2008

Abstract

In this paper, we propose a new approach to learning structured visual compound models from shape-based feature descriptions. Caption text drives the process of grouping boundary fragments detected in an image. Within the learning framework, we transfer several techniques from computational linguistics to the visual domain and build on previous work in image annotation. A statistical translation model establishes links between caption words and image elements; compounds are then built up iteratively using a mutual information measure. Relations between compound elements are extracted automatically and increase the discriminability of the visual models. We validate the approach with results on several synthetic and realistic datasets.
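The iterative compound construction described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each image is reduced to a set of fragment labels, scores fragment pairs by pointwise mutual information over their co-occurrence across images, and greedily selects the highest-scoring pair as a compound candidate. All function and variable names here are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math

def pmi_pairs(images):
    """Pointwise mutual information for fragment pairs.

    `images` is a list of sets of fragment labels detected per image
    (a hypothetical representation of grouped boundary fragments).
    Returns {(a, b): PMI} for fragments that co-occur at least once.
    """
    n = len(images)
    single = Counter()   # per-fragment occurrence counts
    pair = Counter()     # co-occurrence counts within an image
    for frags in images:
        single.update(frags)
        pair.update(combinations(sorted(frags), 2))
    return {
        (a, b): math.log((pair[(a, b)] / n)
                         / ((single[a] / n) * (single[b] / n)))
        for (a, b) in pair
    }

def best_compound(images):
    """One greedy step: return the fragment pair with the highest PMI."""
    scores = pmi_pairs(images)
    return max(scores, key=scores.get)
```

In a full pipeline, the selected pair would be merged into a single compound symbol and the procedure repeated, with the translation model restricting candidates to fragments linked to the same caption word.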