Effects of automated transcription quality on non-native speakers' comprehension in real-time computer-mediated communication

Authors:
Yingxin Pan; Danning Jiang; Lin Yao; Michael Picheny; Yong Qin
Affiliations:
IBM Research - China, Beijing, China; IBM Research - China, Beijing, China; Chinese Academy of Science, Beijing, China; IBM Research - Watson, Yorktown Heights, USA; IBM Research - China, Beijing, China
Venue:
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Year:
2010

Abstract

Real-time transcription has been shown to help non-native speakers' comprehension in real-time communication. Automated speech recognition (ASR) technology is a critical ingredient for its practical deployment. This paper presents a series of studies investigating how the quality of transcripts generated by an ASR system affects user comprehension and subjective evaluation. We first present experiments comparing performance across three transcription conditions: no transcript, a perfect transcript, and a transcript with a 20% word error rate (WER). We found that 20% WER was the most likely critical point at which transcripts are just acceptable and useful. We then examined a lower WER of 10% (a lower bound for today's state-of-the-art systems) using the same experimental design. The results indicated that at 10% WER, comprehension performance improved significantly compared to the no-transcript condition. Finally, we discuss implications for further system development and design.
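For readers unfamiliar with the metric, the sketch below shows how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the ASR output, divided by the number of reference words. The abstract does not describe the scoring procedure used in the studies, so this is the standard textbook definition rather than the authors' method; the function name `word_error_rate` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance / number of reference words.

    Assumption: whitespace tokenization and exact string matching,
    with no text normalization. Real scoring tools usually normalize first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a 10-word reference with one substitution
# ("ten" -> "tan") and one deletion ("morning") in the ASR output.
reference = "the meeting will start at ten tomorrow morning in beijing"
hypothesis = "the meeting will start at tan tomorrow in beijing"
print(word_error_rate(reference, hypothesis))  # 2 errors / 10 words
```

Under this definition, the 20% WER condition corresponds to roughly one error in every five reference words, and the 10% condition to one in every ten.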