Generating sentences from images is an ultimate goal of image recognition. In this paper, we address this goal through a novel problem, the "multi-keyphrase problem". We hypothesize that the contents of an image can be described with multiple keyphrases, and that a natural sentence can be generated by connecting these keyphrases with an experimental grammar model. Existing methods require semantic knowledge such as labels of objects, actions, or scenes, so using them demands the effort of preparing a highly organized dataset. We therefore propose a novel online learning method for multi-keyphrase estimation. The proposed framework is simple and scalable, yet it can generate sentences from images without such semantic knowledge. Moreover, the proposed multi-keyphrase estimation method is also applicable to image annotation, where it achieves state-of-the-art performance. Our experiment, which uses only images and texts, demonstrates that the proposed framework is useful for sentence generation from images.
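To make the idea of online multi-keyphrase estimation concrete, here is a minimal sketch. It is not the method proposed in the paper, because the abstract does not specify the update rule. The sketch swaps in a standard passive-aggressive (PA-I) ranking update purely for illustration. The class name `OnlineKeyphraseRanker`, the toy three-dimensional feature vectors, and the keyphrases are all hypothetical.

```python
# Illustrative sketch only: a PA-I style online ranker over keyphrases.
# The update rule, class name, and toy data are assumptions, not the paper's method.

def dot(w, x):
    """Inner product of two equal-length lists of floats."""
    return sum(wi * xi for wi, xi in zip(w, x))


class OnlineKeyphraseRanker:
    def __init__(self, n_features, keyphrases, C=1.0):
        self.keyphrases = list(keyphrases)
        self.C = C  # aggressiveness cap of the PA-I step size
        self.W = {k: [0.0] * n_features for k in self.keyphrases}

    def scores(self, x):
        """Linear score of every keyphrase for image feature vector x."""
        return {k: dot(w, x) for k, w in self.W.items()}

    def top_k(self, x, k=2):
        """Return the k highest-scoring keyphrases for x."""
        s = self.scores(x)
        return sorted(self.keyphrases, key=lambda p: s[p], reverse=True)[:k]

    def update(self, x, relevant):
        """One online step on a single (image features, keyphrase set) pair."""
        s = self.scores(x)
        rel_set = set(relevant)
        rel = [p for p in self.keyphrases if p in rel_set]
        irr = [p for p in self.keyphrases if p not in rel_set]
        if not rel or not irr:
            return 0.0
        r = min(rel, key=lambda p: s[p])  # lowest-scoring relevant keyphrase
        v = max(irr, key=lambda p: s[p])  # highest-scoring irrelevant keyphrase
        loss = max(0.0, 1.0 - (s[r] - s[v]))  # hinge loss on the ranking margin
        sq_norm = dot(x, x)
        if loss > 0.0 and sq_norm > 0.0:
            tau = min(self.C, loss / (2.0 * sq_norm))
            self.W[r] = [wi + tau * xi for wi, xi in zip(self.W[r], x)]
            self.W[v] = [wi - tau * xi for wi, xi in zip(self.W[v], x)]
        return loss


# Toy stream of (features, keyphrases) pairs; only images and text, no labels
# for objects, actions, or scenes.
phrases = ["a dog", "on the grass", "a boat", "in the water"]
stream = [
    ([1.0, 0.0, 1.0], {"a dog", "on the grass"}),
    ([0.0, 1.0, 1.0], {"a boat", "in the water"}),
]

model = OnlineKeyphraseRanker(n_features=3, keyphrases=phrases)
for _ in range(20):  # several passes over the toy stream
    for x, rel in stream:
        model.update(x, rel)

print(model.top_k([1.0, 0.0, 1.0], k=2))
print(model.top_k([0.0, 1.0, 1.0], k=2))
```

Each step touches only two weight vectors and needs a single example in memory, which is one reason an online formulation can scale to large image and text collections. Connecting the estimated keyphrases into a sentence with a grammar model is a separate stage and is not sketched here.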