Advances in the robust processing of multimodal speech and pen systems

  • Authors: Sharon Oviatt
  • Affiliations: Center for Human-Computer Communication, Department of Computer Science & Engineering, Oregon Graduate Institute of Science & Technology, 20000 N.W. Walker Road, Beaverton, Oregon
  • Venue: Multimodal interface for human-machine communication
  • Year: 2002

Abstract

Multimodal systems have developed rapidly during the past decade, with progress toward building more general and robust systems, as well as more transparent and usable human interfaces. These next-generation multimodal systems aim to improve the expressive power and efficiency of human interfaces, to expand the accessibility of computing for diverse and disabled users, to enhance the performance stability and robustness of recognition-based systems, and to support new forms of computing. In this chapter, we describe the QuickSet multimodal pen/voice system, including its functionality, interface design, natural language processing and fusion techniques, overall architecture, applications, and performance. We also summarize results from two recent empirical studies with QuickSet in which its multimodal architecture is shown to decrease failures in spoken language processing by 19-41%. This performance improvement is mainly due to the mutual disambiguation of input signals that a multimodal architecture makes possible, which occurs at higher rates for challenging user groups (accented versus native speakers) and usage environments (mobile versus stationary use). This research demonstrates that new multimodal architectures can stabilize error-prone recognition technologies and yield major improvements in system robustness.
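To make the idea of mutual disambiguation concrete, the following Python sketch re-ranks the cross-product of two recognizers' n-best lists, discarding semantically incompatible pairs and scoring the rest by joint confidence. It is an illustration only: the n-best lists, confidence values, and the `compatible`/`fuse` helpers are all hypothetical, and QuickSet's actual integration is unification-based rather than this simple multiplicative scheme.

```python
from itertools import product

# Hypothetical n-best lists of (interpretation, confidence) pairs, one per
# recognizer. Labels and scores are invented for illustration.
speech_nbest = [
    ({"act": "create", "object": "ditch"}, 0.42),  # misrecognition ranked first
    ({"act": "create", "object": "dock"}, 0.38),   # intended command ranked second
]
gesture_nbest = [
    ({"shape": "line", "on_water": True}, 0.55),
    ({"shape": "area", "on_water": False}, 0.30),
]

def compatible(speech, gesture):
    """Toy semantic constraint: a dock belongs on water, a ditch does not."""
    return (speech["object"] == "dock") == gesture["on_water"]

def fuse(speech_list, gesture_list):
    """Score the cross-product of the two n-best lists, discard semantically
    incompatible pairs, and return the best joint interpretation."""
    joint = [
        (s, g, s_conf * g_conf)
        for (s, s_conf), (g, g_conf) in product(speech_list, gesture_list)
        if compatible(s, g)
    ]
    return max(joint, key=lambda triple: triple[2], default=None)

best = fuse(speech_nbest, gesture_nbest)
if best:
    s, g, score = best
    # Gesture evidence pulls the second-ranked speech hypothesis ("dock")
    # to the top of the joint ranking: mutual disambiguation in miniature.
    print(s["object"], g["shape"], round(score, 3))
```

Here the top-ranked speech hypothesis ("ditch") is overruled because it conflicts with the gesture evidence, so the correct lower-ranked hypothesis ("dock") wins the joint ranking. This is the re-ranking effect behind the reported 19-41% reduction in spoken language processing failures, even though QuickSet itself performs it through unification over typed feature structures.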