Fusing object detection and region appearance for image-text alignment

  • Authors: Luca Del Pero; Philip Lee; James Magahern; Emily Hartley; Kobus Barnard; Ping Wang; Atul Kanaujia; Niels Haering

  • Affiliations: University of Arizona, Tucson, AZ, USA (Del Pero, Lee, Magahern, Hartley, Barnard); ObjectVideo, Reston, VA, USA (Wang, Kanaujia, Haering)

  • Venue: MM '11 Proceedings of the 19th ACM international conference on Multimedia
  • Year: 2011

Abstract

We present a method for automatically aligning words to image regions that integrates specific object classifiers (e.g., "car" detectors) with weak models based on appearance features. Previous strategies have largely focused on the latter, and thus have not exploited progress on object category recognition. Hence, we augment region labeling with object detection, which simplifies the problem by reliably identifying a subset of the labels, thereby reducing correspondence ambiguity overall. Comprehensive testing on the SAIAPR TC dataset shows that principled integration of object detection improves performance on the region labeling task.
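
The claim that fixing a subset of labels reduces correspondence ambiguity can be illustrated with a toy count. The sketch below is not the authors' method: it assumes a simplified one-to-one alignment between caption words and regions, and the function name `consistent_alignments`, the region identifiers, and the `detected` mapping are hypothetical. It only shows how one reliable detection shrinks the set of alignments left for a weaker appearance model to choose among.

```python
from itertools import permutations


def consistent_alignments(words, regions, detected=None):
    """Enumerate one-to-one word-to-region alignments that agree with detections.

    `detected` is a hypothetical mapping from a word to the region that an
    object detector claims for it; every key must appear in `words`.
    """
    detected = detected or {}
    alignments = []
    for perm in permutations(regions, len(words)):
        alignment = dict(zip(words, perm))
        # Keep only alignments that do not contradict any detector output.
        if all(alignment[word] == region for word, region in detected.items()):
            alignments.append(alignment)
    return alignments


words = ["car", "sky", "grass"]
regions = ["r1", "r2", "r3"]

# No detector: every permutation of regions is a candidate alignment.
unconstrained = consistent_alignments(words, regions)

# A (hypothetical) car detector fires on region r1, pinning one label.
constrained = consistent_alignments(words, regions, detected={"car": "r1"})

print(len(unconstrained), len(constrained))  # 6 2
```

With three words and three regions, pinning a single label cuts the candidates from six to two. A real system would score the remaining candidates with appearance features rather than enumerate them.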