Midge: generating descriptions of images

Authors:
Margaret Mitchell; Xufeng Han; Jeff Hayes
Affiliations:
University of Aberdeen; Stony Brook University; SignWorks of Oregon
Venue:
INLG '12 Proceedings of the Seventh International Natural Language Generation Conference
Year:
2012

Citing 5
Cited 0

Every picture tells a story: generating sentences from images
ECCV '10 Proceedings of the 11th European conference on Computer vision: Part IV

Semi-supervised modeling for prenominal modifier ordering
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2

Composing simple image descriptions using web-scale n-grams
CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning

Corpus-guided sentence generation of natural images
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Baby talk: Understanding and generating simple image descriptions
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition

Abstract

We demonstrate a novel, robust vision-to-language generation system called Midge. Midge is a prototype system that connects computer vision to syntactic structures with semantic constraints, allowing for the automatic generation of detailed image descriptions. We explain how to connect vision detections to trees in Penn Treebank syntax, which provides the scaffolding necessary to further refine data-driven statistical generation approaches for a variety of end goals.