Every picture tells a story: generating sentences from images
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part IV
Semi-supervised modeling for prenominal modifier ordering
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Composing simple image descriptions using web-scale n-grams
CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning
Corpus-guided sentence generation of natural images
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Baby talk: Understanding and generating simple image descriptions
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
Hi-index | 0.00 |
We demonstrate a novel, robust vision-to-language generation system called Midge. Midge is a prototype system that connects computer vision to syntactic structures with semantic constraints, allowing for the automatic generation of detailed image descriptions. We explain how to connect vision detections to trees in Penn Treebank syntax, which provides the scaffolding necessary to further refine data-driven statistical generation approaches for a variety of end goals.