Proceedings of the 14th ACM International Conference on Multimodal Interaction
The way we see the objects around us determines the speech and gestures we use to refer to them. Conversely, the gestures we produce structure our visual perception, and the words we use influence the way we see. Visual perception, language and gesture thus interact with one another in multiple ways. The problem is global and must be tackled as a whole in order to understand the complexity of reference phenomena and to derive a formal model. Such a model may be useful for any human-machine dialogue system that aims at deep comprehension. We show how a referring act takes place within a contextual subset of objects. This subset, called a 'reference domain', is implicit and can be deduced from a number of cues, some coming from the visual context and others from the multimodal utterance. We present the 'multimodal reference domain' model, which takes these cues into account and can be exploited by a multimodal dialogue system when interpreting referring expressions.
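The abstract only names the model, so the following is a minimal illustrative sketch rather than the authors' implementation: all identifiers (SceneObject, ReferenceDomain, domain_from_gesture, resolve_reference) and the gesture-radius heuristic are assumptions introduced for illustration. It shows how a deictic gesture could delimit an implicit reference domain and how a linguistic cue (here, a color adjective) could then pick out a referent inside that domain.

```python
# Illustrative sketch only: the paper publishes no code, so every name and
# heuristic below is a hypothetical stand-in for the described model.
from dataclasses import dataclass


@dataclass
class SceneObject:
    """An object in the visual scene, with features used for reference."""
    name: str
    color: str
    x: float
    y: float


@dataclass
class ReferenceDomain:
    """An implicit contextual subset of objects in which a referring act occurs."""
    objects: list[SceneObject]
    focus: SceneObject | None = None  # most salient object in the domain, if any


def domain_from_gesture(scene, gx, gy, radius=1.0):
    """Build a reference domain from a deictic gesture: objects near the
    pointing location form the domain; the closest one becomes the focus."""
    nearby = [o for o in scene if (o.x - gx) ** 2 + (o.y - gy) ** 2 <= radius ** 2]
    focus = min(nearby, key=lambda o: (o.x - gx) ** 2 + (o.y - gy) ** 2, default=None)
    return ReferenceDomain(objects=nearby, focus=focus)


def resolve_reference(domain, color=None):
    """Interpret a referring expression inside the domain: the linguistic
    constraint (here, just color) filters the candidates, and the domain
    focus breaks ties when several candidates remain."""
    candidates = [o for o in domain.objects if color is None or o.color == color]
    if domain.focus in candidates:
        return domain.focus
    return candidates[0] if candidates else None


if __name__ == "__main__":
    scene = [
        SceneObject("cube-1", "red", 0.2, 0.3),
        SceneObject("cube-2", "blue", 0.5, 0.4),
        SceneObject("cube-3", "red", 4.0, 4.0),  # outside the gestured area
    ]
    # "the red one" + pointing gesture near (0.4, 0.4):
    domain = domain_from_gesture(scene, gx=0.4, gy=0.4)
    print(resolve_reference(domain, color="red"))  # -> cube-1, not cube-3
```

In this toy run, the pointing gesture first restricts the candidates to an implicit domain (cube-1 and cube-2), and the linguistic constraint 'red' then singles out cube-1, even though another red object exists elsewhere in the scene, which is the kind of interplay between visual and multimodal cues the abstract describes.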