Latent Semantic Analysis for Multimodal User Input With Speech and Gestures
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Previous studies have shown that, in multimodal conversational systems, fusing information from multiple modalities can improve overall input interpretation through mutual disambiguation. Inspired by these findings, this paper investigates non-verbal modalities, in particular deictic gesture, in spoken language processing. Our assumption is that during multimodal conversation, a user's deictic gestures on the graphical display signal which part of the underlying domain model is salient at that point of the interaction. This salient domain model can then be used to constrain hypotheses for spoken language processing. Based on this assumption, this paper examines different configurations of salience-driven language models (e.g., n-gram and probabilistic context-free grammar) for spoken language processing across different stages. Our empirical results demonstrate the potential of integrating salience models based on non-verbal modalities into spoken language understanding.
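The salience-driven n-gram configuration described above can be sketched as a linear interpolation between a general language model and a salience model conditioned on the gestured object. The following is a minimal illustration, not the paper's implementation; all probability tables, object names, and the interpolation weight are invented assumptions for the sketch.

```python
# Hypothetical sketch of a salience-driven bigram model:
# a deictic gesture on a display object activates a per-object
# salience model that is interpolated with the general model.
# All numbers and names below are illustrative assumptions.

GENERAL_BIGRAM = {
    ("this", "bedroom"): 0.02,
    ("this", "building"): 0.05,
}

# Per-object salience models: when the user points at an object,
# words describing that object become more probable.
SALIENCE_BIGRAM = {
    "house_12": {
        ("this", "bedroom"): 0.30,
        ("this", "building"): 0.10,
    }
}

def interpolated_prob(prev, word, gestured_object=None, lam=0.6):
    """P(word | prev) = lam * P_salient + (1 - lam) * P_general.

    Falls back to the general model when no gesture is observed.
    A small floor probability stands in for proper smoothing.
    """
    p_general = GENERAL_BIGRAM.get((prev, word), 1e-6)
    if gestured_object is None:
        return p_general
    p_salient = SALIENCE_BIGRAM.get(gestured_object, {}).get((prev, word), 1e-6)
    return lam * p_salient + (1 - lam) * p_general

# With a deictic gesture on "house_12", "bedroom" after "this"
# becomes far more probable than under the general model alone.
with_gesture = interpolated_prob("this", "bedroom", gestured_object="house_12")
without_gesture = interpolated_prob("this", "bedroom")
```

In a real system the interpolated scores would rescore recognizer hypotheses (or weight PCFG rules, in the grammar-based configuration) rather than be looked up from static tables.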