Resolving ambiguities of a gaze and speech interface

  • Authors:
  • Qiaohui Zhang;Atsumi Imamiya;Kentaro Go;Xiaoyang Mao

  • Affiliations:
  • Department of Computer and Media Engineering, University of Yamanashi;Department of Computer and Media Engineering, University of Yamanashi;Center for Integrated Information Processing, University of Yamanashi;Department of Computer and Media Engineering, University of Yamanashi

  • Venue:
  • Proceedings of the 2004 symposium on Eye tracking research & applications
  • Year:
  • 2004

Abstract

Recognition ambiguity is inevitable in any recognition-based user interface. A multimodal architecture should be an effective means of reducing this ambiguity and contributing to error avoidance and recovery, compared with a unimodal one. But does a multimodal architecture always outperform a unimodal one? If not, when does it perform better, and when is it optimal? Furthermore, how can modalities best be combined to gain the advantage of synergy? Little is known about these issues in the available literature. In this paper we address them by analyzing integration strategies for gaze and speech modalities, together with an evaluation experiment that verifies these analyses. The approach involves studying cases of mutual correction and investigating when the mutual-correction phenomenon occurs. The goal of this study is to gain insight into integration strategies and to develop an optimal system that makes error-prone recognition technologies perform at a more stable and robust level within a multimodal architecture.
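
To illustrate the general idea of mutual correction between modalities (not the authors' actual algorithm, which is described in the paper itself), the sketch below re-ranks an N-best speech hypothesis list using the user's gaze fixation. All names and parameters (fuse, gaze_weight, the distance-based gaze score) are hypothetical illustrations under the assumption that candidate targets have known screen positions.

```python
# Minimal sketch of multimodal mutual disambiguation: each modality yields
# ranked hypotheses, and combining them lets one modality correct the other.
from dataclasses import dataclass

@dataclass
class Target:
    name: str
    x: float
    y: float

def fuse(speech_nbest, gaze_fixation, targets, gaze_weight=0.5):
    """Re-rank speech hypotheses (name, confidence) by gaze proximity.

    speech_nbest : list of (target_name, confidence in [0, 1])
    gaze_fixation: (x, y) fixation point on screen
    targets      : candidate Target objects the user may refer to
    """
    gx, gy = gaze_fixation
    positions = {t.name: (t.x, t.y) for t in targets}
    scored = []
    for name, conf in speech_nbest:
        tx, ty = positions[name]
        dist = ((gx - tx) ** 2 + (gy - ty) ** 2) ** 0.5
        gaze_score = 1.0 / (1.0 + dist)            # closer fixation -> higher score
        combined = (1 - gaze_weight) * conf + gaze_weight * gaze_score
        scored.append((combined, name))
    return max(scored)[1]                           # best joint hypothesis

if __name__ == "__main__":
    targets = [Target("Delete", 100, 40), Target("Repeat", 420, 40)]
    # The recognizer slightly prefers the wrong word, but a fixation near
    # "Repeat" corrects it -- a toy example of the mutual-correction effect.
    print(fuse([("Delete", 0.55), ("Repeat", 0.45)], (415, 45), targets))
```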