In a multimodal conversational interface supporting speech and deictic gesture, deictic gestures on the graphical display have traditionally been used to identify user attention, for example, through reference resolution. Since the context of the identified attention can constrain the associated intention, our hypothesis is that deictic gestures can go beyond attention and contribute to intention recognition. Driven by this hypothesis, this paper systematically investigates the role of deictic gestures in intention recognition. We experiment with different model-based and instance-based methods of incorporating gestural information into intention recognition, and we examine the effects of utilizing gestural information at two different processing stages: the speech recognition stage and the language understanding stage. Our empirical results show that utilizing gestural information improves intention recognition, and that performance improves further when gestures are incorporated in both the speech recognition and language understanding stages rather than in either stage alone.
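To make the two processing stages concrete, here is a minimal sketch (not the paper's implementation; all function names, scores, and features are illustrative assumptions): gesture-derived salience first rescores the speech recognizer's n-best hypotheses, and an instance-based (1-nearest-neighbour) classifier then predicts the intention from combined speech and gesture features.

```python
# Hypothetical sketch of gesture-informed intention recognition.
# Stage 1: rescore n-best speech hypotheses using objects made salient
# by a deictic gesture. Stage 2: instance-based intent classification.
# Names, weights, and feature encodings are assumptions for illustration.

def rescore_hypotheses(nbest, salient_objects, weight=0.5):
    """Boost hypotheses that mention gesture-salient objects.

    nbest: list of (text, asr_score) pairs; salient_objects: object
    names selected by the deictic gesture on the display.
    """
    rescored = []
    for text, asr_score in nbest:
        overlap = sum(1 for obj in salient_objects if obj in text)
        rescored.append((text, asr_score + weight * overlap))
    return sorted(rescored, key=lambda h: h[1], reverse=True)

def classify_intention(features, training_examples):
    """1-nearest-neighbour intent classifier over binary feature
    vectors that combine speech and gesture cues (Hamming distance)."""
    def dist(a, b):
        return sum(x != y for x, y in zip(a, b))
    return min(training_examples, key=lambda ex: dist(ex[0], features))[1]

# Usage: a gesture on a "restaurant" icon promotes the matching hypothesis.
nbest = [("zoom to the rest area", 0.40), ("zoom to the restaurant", 0.35)]
best_text = rescore_hypotheses(nbest, salient_objects=["restaurant"])[0][0]

# Toy labelled instances: (feature vector, intention label).
train = [((1, 0, 1), "zoom"), ((0, 1, 0), "query_price")]
intent = classify_intention((1, 0, 0), train)
```

The sketch shows why combining both stages can outperform either alone: rescoring repairs recognition errors before understanding, while the classifier still exploits gesture features when the transcript is already correct.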