Topic models for image annotation and text illustration

  • Authors:
  • Yansong Feng; Mirella Lapata

  • Affiliations:
  • University of Edinburgh, Edinburgh, UK; University of Edinburgh, Edinburgh, UK

  • Venue:
  • HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
  • Year:
  • 2010

Abstract

Image annotation, the task of automatically generating description words for a picture, is a key component in various image search and retrieval applications. Creating image databases for model development is, however, costly and time consuming, since the keywords must be hand-coded and the process repeated for new collections. In this work we exploit the vast resource of images and documents available on the web for developing image annotation models without any human involvement. We describe a probabilistic model based on the assumption that images and their co-occurring textual data are generated by mixtures of latent topics. We show that this model outperforms previously proposed approaches when applied to image annotation and the related task of text illustration despite the noisy nature of our dataset.
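To make the latent-topic assumption concrete, below is a minimal sketch of topic-model-based image annotation: each training image's visual words and co-occurring text are treated as one bag-of-words document, a standard LDA model is fit over the joint vocabulary, and an unseen image is annotated by inferring its topic mixture from visual words alone and ranking text words by p(word | image) = Σ_k p(word | topic k) · p(topic k | image). This is not the authors' exact model; the vocabulary sizes, the synthetic counts, and the use of scikit-learn's LDA implementation are all illustrative assumptions.

```python
"""Sketch: annotate an image via a topic model over joint visual+textual words."""
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)

# Hypothetical vocabulary: the first VIS_V ids are visual words (e.g. quantised
# local descriptors), the remaining TXT_V ids are words from the accompanying text.
VIS_V, TXT_V = 500, 300
V = VIS_V + TXT_V

# Synthetic stand-in for a web-harvested training set: each row counts how often
# every (visual or textual) word occurs in one image+document pair.
n_train = 200
X_train = rng.poisson(0.05, size=(n_train, V))

# Fit a plain LDA model over the joint vocabulary.
lda = LatentDirichletAllocation(n_components=20, random_state=0)
lda.fit(X_train)

# p(word | topic): normalise the learned topic-word weights.
topic_word = lda.components_ / lda.components_.sum(axis=1, keepdims=True)

def annotate(visual_counts, top_n=5):
    """Rank textual words for an unseen image given only its visual-word counts."""
    doc = np.zeros((1, V))
    doc[0, :VIS_V] = visual_counts                 # textual positions stay empty
    doc_topics = lda.transform(doc)[0]             # p(topic | image)
    word_scores = doc_topics @ topic_word          # p(word | image)
    text_scores = word_scores[VIS_V:]              # keep only textual words
    return np.argsort(text_scores)[::-1][:top_n]   # ids of the top keywords

# Example: annotate a new image represented by random visual-word counts.
new_image = rng.poisson(0.05, size=VIS_V)
print("top text-word ids:", annotate(new_image))
```

The same machinery covers the text illustration direction described in the abstract: infer the topic mixture of a document from its textual words and rank candidate images by how well their visual-word distributions match that mixture.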