Linguistic theories in efficient multimodal reference resolution: an empirical investigation

  • Authors:
  • Joyce Y. Chai, Zahar Prasov, Joseph Blaim, Rong Jin

  • Affiliations:
  • Michigan State University, East Lansing, MI (all authors)

  • Venue:
  • Proceedings of the 10th international conference on Intelligent user interfaces
  • Year:
  • 2005

Abstract

Multimodal conversational interfaces provide a natural means for users to communicate with computer systems through multiple modalities such as speech, gesture, and gaze. Building effective multimodal interfaces requires understanding users' multimodal inputs. Previous linguistic and cognitive studies indicate that user language behavior does not occur randomly, but rather follows certain linguistic and cognitive principles. This paper therefore investigates the use of linguistic theories in multimodal interpretation. In particular, we present a greedy algorithm that incorporates Conversational Implicature and the Givenness Hierarchy for efficient multimodal reference resolution. Empirical studies indicate that this algorithm significantly reduces the complexity of multimodal reference resolution compared to a previous graph-matching approach. One major advantage of the greedy algorithm is that prior linguistic and cognitive knowledge can be used to guide the search and significantly prune the search space. Because of its simplicity and generality, this approach has the potential to improve the robustness of interpretation and provide a more practical solution to multimodal input interpretation.
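To give a flavor of greedy, knowledge-pruned reference resolution as the abstract describes it, the sketch below matches spoken referring expressions to gestured objects in temporal order. This is a minimal illustration under assumptions, not the authors' algorithm: the `compatibility` scoring, the feature sets, and the temporal-distance penalty are all hypothetical stand-ins for the linguistic constraints (e.g., the salience ordering of the Givenness Hierarchy) used in the paper.

```python
from dataclasses import dataclass, field

@dataclass
class ReferringExpression:
    text: str            # e.g., "this red cup"
    time: float          # onset time of the spoken phrase (seconds)
    features: set = field(default_factory=set)  # semantic constraints from the utterance

@dataclass
class Candidate:
    name: str            # object identifier
    time: float          # time the object was gestured at / brought into focus
    features: set = field(default_factory=set)

def compatibility(expr, cand):
    """Hypothetical score: feature overlap minus a temporal-distance penalty.
    Prior knowledge (e.g., a gestured object being highly 'activated' on the
    Givenness Hierarchy) would weight this; here it is a plain heuristic."""
    overlap = len(expr.features & cand.features)
    if expr.features and overlap == 0:
        return float("-inf")  # prune semantically incompatible candidates outright
    return overlap - 0.1 * abs(expr.time - cand.time)

def greedy_resolve(expressions, candidates):
    """Resolve each expression to its best unused candidate, in temporal order.
    Each choice is made locally, so cost grows linearly with the number of
    expressions rather than combinatorially as in exhaustive graph matching."""
    resolved, used = {}, set()
    for expr in sorted(expressions, key=lambda e: e.time):
        pool = [c for c in candidates if c.name not in used]
        best = max(pool, key=lambda c: compatibility(expr, c), default=None)
        if best is not None and compatibility(expr, best) > float("-inf"):
            resolved[expr.text] = best.name
            used.add(best.name)
    return resolved

# Hypothetical interaction: two phrases, three visible objects.
exprs = [ReferringExpression("this red cup", 1.0, {"red", "cup"}),
         ReferringExpression("that plate", 2.5, {"plate"})]
cands = [Candidate("cup_1", 1.1, {"red", "cup"}),
         Candidate("plate_1", 2.4, {"plate", "white"}),
         Candidate("cup_2", 5.0, {"blue", "cup"})]
print(greedy_resolve(exprs, cands))  # → {'this red cup': 'cup_1', 'that plate': 'plate_1'}
```

The early `-inf` pruning is where the abstract's claim lives: linguistic constraints eliminate most candidate pairings before any matching cost is paid, which is what makes the greedy search cheap compared to scoring every pairing in a full graph match.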