Effects of automated transcription quality on non-native speakers' comprehension in real-time computer-mediated communication

  • Authors:
  • Yingxin Pan;Danning Jiang;Lin Yao;Michael Picheny;Yong Qin

  • Affiliations:
  • IBM Research-China, Beijing, China;IBM Research-China, Beijing, China;Chinese Academy of Science, Beijing, China;IBM Research - Watson, Yorktown Heights, USA;IBM Research-China, Beijing, China

  • Venue:
  • Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
  • Year:
  • 2010

Abstract

Real-time transcription has been shown to help non-native speakers' comprehension in real-time communication, and automated speech recognition (ASR) technology is critical to deploying it in practice. This paper presents a series of studies investigating how the quality of transcripts generated by an ASR system affects user comprehension and subjective evaluation. The first experiments compare performance across three transcription conditions: no transcript, a perfect transcript, and a transcript with a Word Error Rate (WER) of 20%. We found that 20% WER was the most likely critical point at which transcripts are just acceptable and useful. We then examined a lower WER of 10% (a lower bound for today's state-of-the-art systems) using the same experimental design. The results indicated that at 10% WER, comprehension performance improved significantly compared with the no-transcript condition. Finally, implications for further system development and design are discussed.
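
For readers unfamiliar with the metric, WER is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the ASR output into the reference transcript, divided by the number of words in the reference. The abstract does not say how the authors computed or controlled WER, so the following is only a minimal sketch of the standard definition; the function name `word_error_rate` and the example sentences are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length.

    Illustrative sketch of the standard WER definition; not the
    authors' implementation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: a 10-word reference with one substitution
# ("ten" -> "tan") and one deletion (the second "the").
reference = "the meeting will start at ten in the main room"
hypothesis = "the meeting will start at tan in main room"
print(word_error_rate(reference, hypothesis))
```

Under this definition, a transcript at 20% WER contains roughly two word errors for every ten reference words, and one at 10% WER roughly one.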