Language is sensitive to both semantic and pragmatic effects. To capture both effects, we model language use as a cooperative game between two players: a speaker, who generates an utterance, and a listener, who responds with an action. Specifically, we consider the task of generating spatial references to objects, wherein the listener must accurately identify an object described by the speaker. We show that a speaker model that acts optimally with respect to an explicit, embedded listener model substantially outperforms one that is trained to directly generate spatial descriptions.
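The speaker-with-embedded-listener idea can be sketched concretely. Below is a minimal toy sketch, not the paper's actual model: the domain, the utterances, and the uniform literal semantics are all illustrative assumptions. The speaker scores each candidate utterance by how likely an embedded listener is to pick out the intended object, and utters the best one.

```python
# Toy sketch of a speaker that plans against an embedded listener model.
# The objects, utterances, and semantics below are illustrative assumptions,
# not the model or data from the paper.

# Objects in the scene and candidate spatial/attribute descriptions.
OBJECTS = ["vase_on_table", "vase_on_shelf", "book_on_table"]
UTTERANCES = ["on the table", "on the shelf", "the vase", "the book"]

# Literal semantics: the set of objects each utterance truthfully describes.
MEANING = {
    "on the table": {"vase_on_table", "book_on_table"},
    "on the shelf": {"vase_on_shelf"},
    "the vase":     {"vase_on_table", "vase_on_shelf"},
    "the book":     {"book_on_table"},
}

def listener(utterance):
    """Embedded listener: a uniform distribution over the objects
    consistent with the utterance."""
    consistent = [o for o in OBJECTS if o in MEANING[utterance]]
    return {o: 1.0 / len(consistent) for o in consistent}

def speaker(target):
    """Pick the utterance that maximizes the embedded listener's
    probability of identifying the target object."""
    return max(UTTERANCES, key=lambda u: listener(u).get(target, 0.0))

print(speaker("vase_on_shelf"))  # "on the shelf": uniquely identifies it
print(speaker("book_on_table"))  # "the book": uniquely identifies it
```

Note how the pragmatic effect falls out of the game: "the vase" is literally true of the shelf vase, but the speaker prefers "on the shelf" because the embedded listener resolves it unambiguously. A speaker trained only to generate true descriptions would have no such preference.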