A visually grounded natural language interface for reference to spatial scenes
Proceedings of the 5th international conference on Multimodal interfaces
Interaction with virtual 3D environments comes with a host of challenges. For instance, because 3D objects tend to occlude one another, selecting objects with pointing gestures is problematic, and even more so when the scene contains many objects. In the real world we tend to use speech to clarify our intent, referring to distinctive attributes of the object and/or its absolute or relative location in space. Multimodal interactive systems involving speech and gesture have generally relied on speech for commands and deictic gestures for indicating the target object. In this paper, we present a system that allows object references to be made using gestures and speech together, supporting a variety of referring expressions inspired by real-world usage.
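The abstract describes resolving an object reference by combining a spoken attribute (which narrows the candidate set) with a deictic pointing gesture (which selects among the remaining candidates). The sketch below illustrates one minimal way such fusion could work; the scene objects, attribute vocabulary, and scoring by angular distance to the pointing ray are illustrative assumptions, not the paper's actual method.

```python
import math

# Hypothetical scene objects; ids, colors, and positions are illustrative only.
scene = [
    {"id": "red_cube",   "color": "red",  "pos": (0.0, 0.0, 2.0)},
    {"id": "blue_cube",  "color": "blue", "pos": (0.2, 0.0, 2.1)},
    {"id": "red_sphere", "color": "red",  "pos": (1.5, 0.0, 3.0)},
]

def angle_to_ray(origin, direction, point):
    """Angle (radians) between the pointing ray and the vector to `point`."""
    v = [p - o for p, o in zip(point, origin)]
    norm_v = math.sqrt(sum(c * c for c in v))
    norm_d = math.sqrt(sum(c * c for c in direction))
    cos_a = sum(a * b for a, b in zip(v, direction)) / (norm_v * norm_d)
    return math.acos(max(-1.0, min(1.0, cos_a)))

def resolve_reference(spoken_color, origin, direction):
    """Speech narrows the candidate set; the gesture picks the candidate
    whose position lies closest to the pointing ray."""
    matches = [o for o in scene if o["color"] == spoken_color]
    if not matches:
        return None
    return min(matches, key=lambda o: angle_to_ray(origin, direction, o["pos"]))

# "the red one" while pointing straight ahead along +z:
target = resolve_reference("red", (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
print(target["id"])  # red_cube: both red objects match, but it is nearer the ray
```

Even this toy version shows why the combination helps with occlusion: speech alone is ambiguous when several objects share an attribute, while pointing alone is ambiguous when objects cluster along the ray.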