Mutual disambiguation of recognition errors in a multimodal architecture

  • Authors:
  • Sharon Oviatt

  • Affiliations:
  • Center for Human-Computer Communication, Oregon Graduate Institute of Science and Technology, P.O. Box 91000, Portland, OR

  • Venue:
  • Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

  • Year:
  • 1999

Abstract

As a new generation of multimodal/media systems begins to define itself, researchers are attempting to learn how to combine different modes into strategically integrated whole systems. In theory, well designed multimodal systems should be able to integrate complementary modalities in a manner that supports mutual disambiguation (MD) of errors and leads to more robust performance. In this study, over 2,000 multimodal utterances by both native and accented speakers of English were processed by a multimodal system, and then logged and analyzed. The results confirmed that multimodal systems can indeed support significant levels of MD, and also higher levels of MD for the more challenging accented users. As a result, although speech recognition as a stand-alone performed far more poorly for accented speakers, their multimodal recognition rates did not differ from those of native speakers. Implications are discussed for the development of future multimodal architectures that can perform in a more robust and stable manner than individual recognition technologies. Also discussed is the design of interfaces that support diversity in tangible ways, and that function well under challenging real-world usage conditions.
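
The mutual-disambiguation idea can be illustrated with a small late-fusion sketch: each recognizer contributes an n-best list, semantically incompatible cross-modal pairings are pruned, and the surviving joint hypotheses are re-ranked. The sketch below is illustrative only and is not the architecture evaluated in the paper; the n-best lists, the compatibility table, and the product-of-scores ranking are assumptions chosen to show how evidence from one mode can pull a lower-ranked hypothesis from the other mode to the top.

    from itertools import product

    # Hypothetical n-best lists from two recognizers, each entry an
    # (interpretation, score) pair. Scores are assumed to be normalized
    # confidences in [0, 1]; the names and values are purely illustrative.
    speech_nbest = [("pan", 0.42), ("van", 0.38), ("ban", 0.20)]
    gesture_nbest = [("vehicle-symbol", 0.55), ("area-circle", 0.45)]

    # Toy semantic-compatibility table: which spoken words can unify
    # with which pen gestures. A real system would derive this from the
    # semantics and timing of the parallel input streams.
    compatible = {
        ("van", "vehicle-symbol"),
        ("pan", "area-circle"),
    }

    def fuse(speech, gesture):
        """Rank joint hypotheses, keeping only compatible cross-modal pairs.

        Mutual disambiguation occurs when the top joint hypothesis promotes
        an interpretation that was not ranked first on one of the
        individual n-best lists.
        """
        joint = [
            (s, g, s_score * g_score)          # simple product-of-scores fusion
            for (s, s_score), (g, g_score) in product(speech, gesture)
            if (s, g) in compatible            # prune incompatible pairs
        ]
        return sorted(joint, key=lambda h: h[2], reverse=True)

    if __name__ == "__main__":
        for spoken, gestured, score in fuse(speech_nbest, gesture_nbest):
            print(f"{spoken:>4} + {gestured:<14} score={score:.3f}")
        # "van" + "vehicle-symbol" wins even though "van" was ranked second
        # by the speech recognizer alone: the gesture evidence disambiguates it.

In this toy run the accented-speech-style error (a misrecognized top speech hypothesis) is recovered because the gesture channel vetoes the incompatible alternative, which is the qualitative effect the study measures at scale.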