Many user interfaces, from graphic design programs to navigation aids in cars, share a virtual space with the user. Such applications are often ideal candidates for speech interfaces that allow the user to refer to objects in the shared space. We present an analysis of how people describe objects in spatial scenes using natural language. Based on this study, we describe a system that uses synthetic vision to "see" such scenes from the person's point of view, and that understands complex natural language descriptions referring to objects in the scenes. This system is based on a rich notion of semantic compositionality embedded in a grounded language understanding framework. We describe its semantic elements, their compositional behaviour, and their grounding through the synthetic vision system. To conclude, we evaluate the performance of the system on unconstrained input.
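The compositional, grounded interpretation described above can be illustrated with a minimal sketch. This is not the paper's implementation; all names (`Obj`, `compose`, the word-level scoring functions) are hypothetical. The idea shown is that each word in a description contributes a graded scoring function over candidate objects in the scene, and composition combines those scores so that a phrase like "the green one on the left" picks out a referent:

```python
# Hypothetical sketch (not the system described in the abstract):
# each word grounds out as a score map over scene objects, and
# composition multiplies the maps to resolve a referring expression.
from dataclasses import dataclass

@dataclass(frozen=True)
class Obj:
    color: str
    x: float  # 0 = far left, 1 = far right in the viewer's frame

def green(scene):
    # Categorical property: full score for green objects, zero otherwise.
    return {o: 1.0 if o.color == "green" else 0.0 for o in scene}

def leftmost(scene):
    # Graded spatial property: score decreases from left to right.
    xs = sorted(o.x for o in scene)
    span = (xs[-1] - xs[0]) or 1.0
    return {o: 1.0 - (o.x - xs[0]) / span for o in scene}

def compose(*maps):
    # Combine word-level groundings by multiplying their scores.
    out = {}
    for o in maps[0]:
        score = 1.0
        for m in maps:
            score *= m[o]
        out[o] = score
    return out

scene = [Obj("green", 0.1), Obj("red", 0.5), Obj("green", 0.9)]
scores = compose(green(scene), leftmost(scene))
referent = max(scores, key=scores.get)
print(referent.color, referent.x)  # → green 0.1
```

The multiplicative combination is one simple choice; it captures the intuition that a referent must satisfy every part of the description to some degree, so "the green one on the left" selects the leftmost green object rather than merely any green or any leftmost object.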