We present a holistic data-driven approach to image description generation, exploiting the vast amount of (noisy) parallel image data and associated natural language descriptions available on the web. More specifically, given a query image, we retrieve existing human-composed phrases used to describe visually similar images, then selectively combine those phrases to generate a novel description for the query image. We cast the generation process as constraint optimization problems, collectively incorporating multiple interconnected aspects of language composition for content planning, surface realization and discourse structure. Evaluation by human annotators indicates that our final system generates more semantically correct and linguistically appealing descriptions than two nontrivial baselines.
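As a rough illustration of the "selectively combine those phrases" step, the sketch below picks at most one retrieved phrase per slot so that the total relevance score is maximized under a word budget. It is only a toy under stated assumptions: the candidate phrases, their scores, the fixed NP/VP/PP slot order, and the `compose` function are all hypothetical, and exhaustive search stands in for the constraint optimization the abstract mentions, whose actual objective and constraints are not given here.

```python
from itertools import product

# Hypothetical phrases retrieved from visually similar images, as
# (text, relevance score) pairs. All values are invented for illustration.
CANDIDATES = {
    "NP": [("a brown dog", 0.9), ("a small puppy", 0.6)],
    "VP": [("running", 0.7), ("sleeping quietly", 0.4)],
    "PP": [("on the beach", 0.8), ("in the tall green grass", 0.75)],
}
ORDER = ["NP", "VP", "PP"]  # fixed slot order, a stand-in for surface realization


def compose(candidates, max_words):
    """Pick at most one phrase per slot, maximizing total score under a word budget."""
    best_score, best_text = 0.0, ""
    # None means "leave this slot empty".
    options = [[None] + candidates[slot] for slot in ORDER]
    for choice in product(*options):
        chosen = [c for c in choice if c is not None]
        words = sum(len(text.split()) for text, _ in chosen)
        if words > max_words:
            continue  # length constraint violated
        score = sum(s for _, s in chosen)
        if score > best_score:
            best_score = score
            best_text = " ".join(text for text, _ in chosen)
    return best_text, best_score


if __name__ == "__main__":
    print(compose(CANDIDATES, max_words=7))
```

Exhaustive enumeration is only workable for a handful of candidates. With realistic candidate pools, the same objective and constraints would more plausibly be handed to an integer programming solver.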