Unsupervised disambiguation of image captions

  • Authors:
  • Wesley May, Sanja Fidler, Afsaneh Fazly, Sven Dickinson, Suzanne Stevenson

  • Affiliations:
  • University of Toronto, Toronto, Ontario, Canada (all authors)

  • Venue:
  • SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
  • Year:
  • 2012

Abstract

Given a set of images with related captions, our goal is to show how visual features can improve the accuracy of unsupervised word sense disambiguation when the textual context is very small, as this sort of data is common in news and social media. We extend previous work in unsupervised text-only disambiguation with methods that integrate text and images. We construct a corpus by using Amazon Mechanical Turk to caption sense-tagged images gathered from ImageNet. Using a Yarowsky-inspired algorithm, we show that gains can be made over text-only disambiguation, as well as over multimodal approaches such as Latent Dirichlet Allocation.
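
The Yarowsky algorithm referenced in the abstract is a semi-supervised bootstrapping procedure: start from a handful of seed-labeled examples per sense, train a simple classifier, confidently label more instances, and repeat until no labels change. The sketch below is a minimal Python illustration of that loop over bags of features that mix caption words with visual codewords, in the spirit of the paper's text-plus-image integration; the function name, the feature names such as vis_water, the smoothing constants, and the Naive Bayes scorer are illustrative assumptions, not the authors' implementation.

    from collections import Counter
    import math

    def yarowsky_bootstrap(instances, seed_labels, senses,
                           threshold=0.7, max_iters=10):
        """Yarowsky-style bootstrapping sketch (hypothetical, not the paper's code).

        instances: list of feature bags, each mixing caption words with
                   visual codewords for the accompanying image
        seed_labels: {instance index: sense} for the initial seed examples
        """
        labels = dict(seed_labels)
        for _ in range(max_iters):
            # Re-estimate feature-sense counts from currently labeled instances.
            counts = {s: Counter() for s in senses}
            totals = Counter()
            for i, sense in labels.items():
                counts[sense].update(instances[i])
                totals[sense] += 1
            changed = False
            for i, feats in enumerate(instances):
                if i in seed_labels:
                    continue  # never relabel the seeds
                # Smoothed Naive Bayes log-score for each sense.
                scores = {}
                for s in senses:
                    denom = sum(counts[s].values()) + 1
                    scores[s] = math.log(totals[s] + 1) + sum(
                        math.log((counts[s][f] + 0.1) / denom) for f in feats
                    )
                best = max(scores, key=scores.get)
                # Softmax confidence of the top-scoring sense.
                conf = 1.0 / sum(math.exp(v - scores[best])
                                 for v in scores.values())
                if conf >= threshold and labels.get(i) != best:
                    labels[i] = best
                    changed = True
            if not changed:
                break  # converged: no instance changed its label
        return labels

    # Toy usage: captions for the ambiguous word "bank", with made-up
    # visual codewords standing in for image features.
    instances = [
        ["bank", "river", "vis_water"],
        ["bank", "money", "vis_building"],
        ["bank", "vis_water"],
        ["bank", "teller", "vis_building"],
    ]
    seeds = {0: "river_bank", 1: "financial_bank"}
    print(yarowsky_bootstrap(instances, seeds,
                             ["river_bank", "financial_bank"]))

In the paper's setting the visual features would presumably be derived from the ImageNet images themselves rather than hand-written codeword strings; the point of the sketch is only the bootstrapping loop, where a short caption alone may be too ambiguous but the combined text and visual evidence pushes an instance over the confidence threshold.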