What's in a gaze?: the role of eye-gaze in reference resolution in multimodal conversational interfaces

Authors:
Zahar Prasov; Joyce Y. Chai
Affiliations:
Michigan State University, East Lansing, MI; Michigan State University, East Lansing, MI
Venue:
Proceedings of the 13th international conference on Intelligent user interfaces
Year:
2008

Abstract

Multimodal conversational interfaces allow users to carry on a spoken dialog with a graphical display in order to accomplish a particular task. Motivated by previous psycholinguistic findings, we examine how eye-gaze contributes to reference resolution in such a setting. Specifically, we present an integrated probabilistic framework that combines speech and eye-gaze for reference resolution. We further examine the relationship between eye-gaze and increasingly detailed domain modeling with its corresponding linguistic processing. Our empirical results show that incorporating eye-gaze significantly improves reference resolution performance. This improvement is most dramatic when a simple domain model is used. Our results also show that minimal domain modeling combined with eye-gaze significantly outperforms complex domain modeling without eye-gaze, which indicates that eye-gaze can potentially compensate for a lack of domain modeling in reference resolution.
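
The abstract does not spell out the probabilistic framework, so the following is only an illustrative sketch of the general idea of combining speech and eye-gaze evidence to rank candidate referents. It is not the authors' model. Everything in it is an assumption made for illustration: the `Fixation` type, the `gaze_salience` and `resolve_reference` functions, the use of total fixation duration as a salience cue, the product rule for combining the two sources, and the toy objects and numbers.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    """A single gaze fixation on a display object (hypothetical type)."""
    object_id: str
    duration_ms: float


def gaze_salience(fixations, object_ids):
    """Turn total fixation time per object into a distribution over objects.

    Assumption: longer total fixation time means higher salience.
    Falls back to a uniform distribution when no object was fixated.
    """
    totals = {obj: 0.0 for obj in object_ids}
    for fix in fixations:
        if fix.object_id in totals:
            totals[fix.object_id] += fix.duration_ms
    total = sum(totals.values())
    if total == 0:
        return {obj: 1.0 / len(object_ids) for obj in object_ids}
    return {obj: t / total for obj, t in totals.items()}


def resolve_reference(compatibility, salience):
    """Combine linguistic compatibility with gaze salience.

    Assumption: the two sources are multiplied and renormalized.
    If no object is compatible with the expression, gaze alone is used.
    """
    scores = {obj: compatibility.get(obj, 0.0) * s for obj, s in salience.items()}
    norm = sum(scores.values())
    if norm == 0:
        return dict(salience)
    return {obj: score / norm for obj, score in scores.items()}


# Toy scene: the user says "the lamp" while looking mostly at lamp_2.
objects = ["lamp_1", "lamp_2", "chair_1"]
fixations = [
    Fixation("lamp_2", 400.0),
    Fixation("chair_1", 300.0),
    Fixation("lamp_1", 100.0),
    Fixation("lamp_2", 200.0),
]
# A very simple "domain model": 1.0 if the object's type matches the noun.
compatibility = {"lamp_1": 1.0, "lamp_2": 1.0, "chair_1": 0.0}

salience = gaze_salience(fixations, objects)
posterior = resolve_reference(compatibility, salience)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

In this toy case the spoken expression alone cannot separate the two lamps, and the gaze evidence breaks the tie. This mirrors, in a very simplified form, the abstract's claim that eye-gaze helps most when the domain model is simple.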