This paper presents an overview of our work on integrating language with vision to endow robots with the ability to understand complex scenes. We propose and motivate the Vision-Action-Language loop, a form of cognitive dialogue that enables us to integrate current tools from linguistics, vision, and AI. We present several experimental results from a preliminary implementation and discuss future research directions that we view as crucial for developing the cognitive robots of the future.
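At a very high level, the Vision-Action-Language loop can be pictured as a cycle in which perception proposes hypotheses, language grounds them, and action selects what to attend to next. The sketch below is purely illustrative: every class and method name is our own assumption, not the authors' implementation.

```python
# Illustrative sketch of a Vision-Action-Language loop.
# All names here are hypothetical, not the authors' system.

class VisionActionLanguageLoop:
    def __init__(self, detections, lexicon):
        # detections: visual hypotheses with confidences, e.g. {"cup": 0.9}
        # lexicon: maps visual labels to linguistic descriptions
        self.detections = detections
        self.lexicon = lexicon

    def perceive(self):
        # Vision step: keep hypotheses above a confidence threshold.
        return {k: v for k, v in self.detections.items() if v > 0.5}

    def describe(self, hypotheses):
        # Language step: ground each surviving hypothesis in the lexicon.
        return [self.lexicon.get(label, label) for label in hypotheses]

    def act(self, hypotheses):
        # Action step: attend to the most confident hypothesis (a stand-in
        # for an exploratory action that would refine perception).
        return max(hypotheses, key=hypotheses.get) if hypotheses else None


loop = VisionActionLanguageLoop(
    {"cup": 0.9, "bowl": 0.4},
    {"cup": "a cup on the table"},
)
hyps = loop.perceive()
print(loop.describe(hyps))  # linguistic descriptions of confident hypotheses
print(loop.act(hyps))       # label chosen for the next action
```

In a real system each step would be far richer (detectors, parsers, planners), but the point of the loop is that the output of each modality constrains the next iteration of the others.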