Fusing object detection and region appearance for image-text alignment

Authors:
Luca Del Pero; Philip Lee; James Magahern; Emily Hartley; Kobus Barnard; Ping Wang; Atul Kanaujia; Niels Haering
Affiliations:
University of Arizona, Tucson, AZ, USA; University of Arizona, Tucson, AZ, USA; University of Arizona, Tucson, AZ, USA; University of Arizona, Tucson, AZ, USA; University of Arizona, Tucson, AZ, USA; ObjectVideo, Reston, VA, USA; ObjectVideo, Reston, VA, USA; ObjectVideo, Reston, VA, USA
Venue:
MM '11 Proceedings of the 19th ACM international conference on Multimedia
Year:
2011

Abstract

We present a method for automatically aligning words to image regions that integrates specific object classifiers (e.g., "car" detectors) with weak models based on appearance features. Previous strategies have largely focused on the latter, and thus have not exploited progress on object category recognition. Hence, we augment region labeling with object detection, which simplifies the problem by reliably identifying a subset of the labels and thereby reduces correspondence ambiguity overall. Comprehensive testing on the SAIAPR TC-12 dataset shows that principled integration of object detection improves performance on the region labeling task.
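
As a rough illustration of the idea (not the authors' model), the sketch below fuses a weak appearance score with an object-detector score for each region and assigns each region the best caption word. Everything here is an assumption made for illustration: the function name `align_words_to_regions`, the array layout, the product fusion rule, and the neutral factor of 0.5 used for words that have no detector.

```python
import numpy as np

def align_words_to_regions(appearance, detector, caption_words, vocab):
    """Assign one caption word to each region (illustrative fusion rule).

    appearance: (n_regions, n_vocab) array of weak appearance-based scores.
    detector:   (n_regions, n_vocab) array of detector confidences in [0, 1],
                with NaN wherever no detector exists for that word.
    """
    cols = [vocab.index(w) for w in caption_words]
    app = appearance[:, cols]
    det = detector[:, cols]
    # Assumption: words without a detector get a neutral factor of 0.5,
    # so a confident detection boosts a word and a confident miss suppresses it.
    det = np.where(np.isnan(det), 0.5, det)
    fused = app * det
    best = fused.argmax(axis=1)
    return [caption_words[j] for j in best]

vocab = ["car", "sky", "road"]
caption = ["car", "road"]
appearance = np.array([[0.3, 0.3, 0.4],
                       [0.2, 0.2, 0.6]])
# Only "car" has a detector: it fires on region 0 and not on region 1.
detector = np.array([[0.9, np.nan, np.nan],
                     [0.1, np.nan, np.nan]])
labels = align_words_to_regions(appearance, detector, caption, vocab)
```

In this toy input, appearance alone would label both regions "road"; the confident "car" detection on the first region changes that label, which in turn leaves "road" as the more plausible word for the second region.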