Given a set of captioned images, our goal is to show how visual features can improve the accuracy of unsupervised word sense disambiguation when the textual context is very small, as is common in news and social media. We extend previous work on unsupervised text-only disambiguation with methods that integrate text and images. We construct a corpus by using Amazon Mechanical Turk to caption sense-tagged images gathered from ImageNet. Using a Yarowsky-inspired bootstrapping algorithm, we show gains over both text-only disambiguation and multimodal approaches based on Latent Dirichlet Allocation.
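The Yarowsky-style bootstrapping the abstract refers to can be sketched as a self-training loop: a small set of seed rules labels some occurrences, a decision list is grown from the confidently labelled data, and the two steps alternate until convergence. The sketch below is a minimal text-only illustration under assumed toy data and feature choices (`yarowsky`, the feature sets, and the seed rules are all hypothetical), not the paper's actual pipeline, which additionally integrates visual features.

```python
# Minimal sketch of Yarowsky-style bootstrapping for word sense
# disambiguation. Toy contexts, seeds, and the flat vote-counting
# decision list are illustrative assumptions, not the paper's method.
from collections import defaultdict

def yarowsky(contexts, seeds, threshold=0.7, max_iters=10):
    """contexts: list of feature sets, one per ambiguous occurrence.
    seeds: dict mapping a feature to a sense (the initial decision list).
    Returns one sense label (or None) per context."""
    rules = dict(seeds)            # feature -> sense
    labels = [None] * len(contexts)
    for _ in range(max_iters):
        changed = False
        # Labelling step: assign a sense where matching rules agree enough.
        for i, feats in enumerate(contexts):
            votes = defaultdict(int)
            for f in feats:
                if f in rules:
                    votes[rules[f]] += 1
            if votes:
                sense, n = max(votes.items(), key=lambda kv: kv[1])
                if n / sum(votes.values()) >= threshold and labels[i] != sense:
                    labels[i] = sense
                    changed = True
        # Rule-growing step: add features that strongly indicate one sense
        # among the currently labelled contexts (one sense per collocation).
        counts = defaultdict(lambda: defaultdict(int))
        for feats, lab in zip(contexts, labels):
            if lab is not None:
                for f in feats:
                    counts[f][lab] += 1
        for f, by_sense in counts.items():
            sense, n = max(by_sense.items(), key=lambda kv: kv[1])
            if n / sum(by_sense.values()) >= threshold:
                rules[f] = sense
        if not changed:
            break
    return labels

# Toy example for the ambiguous word "bass": two seed collocations
# propagate labels to contexts that share no seed feature.
contexts = [
    {"fish", "river"},   # seeded via "fish"
    {"guitar", "amp"},   # seeded via "guitar"
    {"river", "catch"},  # labelled once "river" becomes a rule
    {"amp", "loud"},     # labelled once "amp" becomes a rule
]
seeds = {"fish": "animal", "guitar": "music"}
print(yarowsky(contexts, seeds))
# → ['animal', 'music', 'animal', 'music']
```

The loop mirrors the one-sense-per-collocation heuristic: any feature that co-occurs overwhelmingly with one sense is promoted to a rule, which in turn labels further contexts on the next pass.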