Multimodal conversational interfaces provide a natural means for users to communicate with computer systems through multiple modalities such as speech, gesture, and gaze. Building effective multimodal interfaces requires an accurate understanding of users' multimodal inputs. Previous linguistic and cognitive studies indicate that user language behavior does not occur randomly, but rather follows certain linguistic and cognitive principles. This paper therefore investigates the use of linguistic theories in multimodal interpretation. In particular, we present a greedy algorithm that incorporates Conversational Implicature and the Givenness Hierarchy for efficient multimodal reference resolution. Empirical studies indicate that this algorithm significantly reduces the complexity of multimodal reference resolution compared to a previous graph-matching approach. A major advantage of the greedy algorithm is that prior linguistic and cognitive knowledge can be used to guide the search and substantially prune the search space. Because of its simplicity and generality, this approach has the potential to improve the robustness of interpretation and to provide a more practical solution for multimodal input interpretation.
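To make the idea concrete, below is a minimal Python sketch of one way such a greedy resolution step could look. It is an illustration under stated assumptions, not the paper's implementation: the data classes, the cognitive-status labels, the compatibility test, and the salience weights are hypothetical placeholders. Each referring expression is processed in temporal order and bound to the most salient unused candidate, where salience combines cognitive status from the Givenness Hierarchy with gesture-speech temporal alignment, so low-salience candidates are pruned without exhaustive matching.

from dataclasses import dataclass
from typing import Optional

# Cognitive statuses from the Givenness Hierarchy, most to least accessible.
GIVENNESS_ORDER = [
    "in_focus", "activated", "familiar",
    "uniquely_identifiable", "referential", "type_identifiable",
]

@dataclass
class Candidate:
    obj_id: str
    semantic_type: str                    # e.g. "house", "street"
    status: str                           # cognitive status of the object
    gesture_time: Optional[float] = None  # time of an accompanying pointing gesture, if any

@dataclass
class ReferringExpression:
    text: str            # e.g. "this house"
    semantic_type: str
    speech_time: float   # onset time of the expression in the speech stream

def compatible(expr, cand):
    # Semantic-type agreement as a hard constraint: "this house" cannot
    # resolve to a street, so such candidates are pruned immediately.
    return expr.semantic_type == cand.semantic_type

def salience(expr, cand):
    # Prefer candidates with a higher cognitive status and, if a gesture is
    # present, closer gesture-speech temporal alignment (weights illustrative).
    status_score = len(GIVENNESS_ORDER) - GIVENNESS_ORDER.index(cand.status)
    gesture_score = 0.0
    if cand.gesture_time is not None:
        gesture_score = 1.0 / (1.0 + abs(expr.speech_time - cand.gesture_time))
    return status_score + gesture_score

def resolve_greedily(expressions, candidates):
    # Bind each referring expression, in temporal order, to its best unused
    # candidate; no backtracking, so the search space stays small.
    unused = list(candidates)
    resolution = {}
    for expr in sorted(expressions, key=lambda e: e.speech_time):
        viable = [c for c in unused if compatible(expr, c)]
        if not viable:
            resolution[expr.text] = None
            continue
        best = max(viable, key=lambda c: salience(expr, c))
        resolution[expr.text] = best.obj_id
        unused.remove(best)
    return resolution

if __name__ == "__main__":
    exprs = [ReferringExpression("this house", "house", speech_time=1.2)]
    cands = [
        Candidate("house_1", "house", "familiar"),
        Candidate("house_2", "house", "activated", gesture_time=1.1),
    ]
    print(resolve_greedily(exprs, cands))  # {'this house': 'house_2'}

The greedy, non-backtracking binding with hard pruning is what keeps the complexity low relative to a graph-matching formulation that scores all pairings of expressions and candidates jointly.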