Resolving ambiguities of a gaze and speech interface

  • Authors:
  • Qiaohui Zhang;Atsumi Imamiya;Kentaro Go;Xiaoyang Mao

  • Affiliations:
  • Department of Computer and Media Engineering, University of Yamanashi;Department of Computer and Media Engineering, University of Yamanashi;Center for Integrated Information Processing, University of Yamanashi;Department of Computer and Media Engineering, University of Yamanashi

  • Venue:
  • Proceedings of the 2004 symposium on Eye tracking research & applications
  • Year:
  • 2004

Abstract

Recognition ambiguity is inevitable in any recognition-based user interface. A multimodal architecture should be an effective means of reducing this ambiguity and contributing to error avoidance and recovery, compared with a unimodal one. But does a multimodal architecture always outperform a unimodal one? If not, when does it perform better, and when is it optimal? Furthermore, how can modalities best be combined to gain the advantage of synergy? Little is known about these issues in the available literature. In this paper we address them by analyzing integration strategies for gaze and speech modalities, together with an evaluation experiment that verifies these analyses. The approach involves studying cases of mutual correction and investigating when the mutual-correction phenomenon occurs. The goal of this study is to gain insight into integration strategies and to develop an optimal system that makes error-prone recognition technologies perform at a more stable and robust level within a multimodal architecture.
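
To illustrate the general idea of mutual correction between modalities (not the authors' actual algorithm, which is described in the paper itself), the sketch below re-ranks an N-best speech hypothesis list using the user's gaze fixation. All names and parameters (fuse, gaze_weight, the distance-based gaze score) are hypothetical illustrations under the assumption that candidate targets have known screen positions.

```python
# Minimal sketch of multimodal mutual disambiguation: each modality yields
# ranked hypotheses, and combining them lets one modality correct the other.
from dataclasses import dataclass

@dataclass
class Target:
    name: str
    x: float
    y: float

def fuse(speech_nbest, gaze_fixation, targets, gaze_weight=0.5):
    """Re-rank speech hypotheses (name, confidence) by gaze proximity.

    speech_nbest : list of (target_name, confidence in [0, 1])
    gaze_fixation: (x, y) fixation point on screen
    targets      : candidate Target objects the user may refer to
    """
    gx, gy = gaze_fixation
    positions = {t.name: (t.x, t.y) for t in targets}
    scored = []
    for name, conf in speech_nbest:
        tx, ty = positions[name]
        dist = ((gx - tx) ** 2 + (gy - ty) ** 2) ** 0.5
        gaze_score = 1.0 / (1.0 + dist)            # closer fixation -> higher score
        combined = (1 - gaze_weight) * conf + gaze_weight * gaze_score
        scored.append((combined, name))
    return max(scored)[1]                           # best joint hypothesis

if __name__ == "__main__":
    targets = [Target("Delete", 100, 40), Target("Repeat", 420, 40)]
    # The recognizer slightly prefers the wrong word, but a fixation near
    # "Repeat" corrects it -- a toy example of the mutual-correction effect.
    print(fuse([("Delete", 0.55), ("Repeat", 0.45)], (415, 45), targets))
```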