Discovering multipart appearance models from captioned images

Authors:
Michael Jamieson;Yulia Eskin;Afsaneh Fazly;Suzanne Stevenson;Sven Dickinson
Affiliations:
University of Toronto;University of Toronto;University of Toronto;University of Toronto;University of Toronto
Venue:
ECCV'10: Proceedings of the 11th European Conference on Computer Vision, Part V
Year:
2010

Abstract

Even a relatively unstructured captioned image set depicting a variety of objects in cluttered scenes contains strong correlations between caption words and repeated visual structures. We exploit these correlations to discover named objects and learn hierarchical models of their appearance. Revising and extending a previous technique for finding small, distinctive configurations of local features, our method assembles these co-occurring parts into graphs with greater spatial extent and flexibility. The resulting multipart appearance models remain scale, translation and rotation invariant, but are more reliable detectors and provide better localization. We demonstrate improved annotation precision and recall on datasets to which the non-hierarchical technique was previously applied and show extended spatial coverage of detected objects.