A readability evaluation of real-time crowd captions in the classroom

Authors:
Raja S. Kushalnagar;Walter S. Lasecki;Jeffrey P. Bigham
Affiliations:
Rochester Institute of Technology, Rochester, NY, USA;University of Rochester, Rochester, NY, USA;University of Rochester, Rochester, NY, USA
Venue:
Proceedings of the 14th international ACM SIGACCESS conference on Computers and accessibility
Year:
2012

Abstract

Deaf and hard of hearing individuals need accommodations that transform aural information into visual information, such as captions generated in real time, to enhance their access to spoken information in lectures and other live events. Captions produced by professional captionists work well in general events such as community or legal meetings, but are often unsatisfactory in specialized content events such as higher education classrooms. In addition, professional captionists are hard to hire, especially those who have experience in specialized content areas, as they are scarce and expensive. Captions produced by commercial automatic speech recognition (ASR) software are far cheaper, but are often perceived as unreadable because of ASR's sensitivity to accents and background noise and its slow response time. We ran a study to evaluate the readability of captions generated by a new crowd captioning approach versus professional captionists and ASR. In this approach, captions are typed by classmates into a system that aligns and merges the multiple incomplete caption streams into a single, comprehensive real-time transcript. Our study asked 48 deaf and hearing readers to evaluate transcripts produced by a professional captionist, ASR, and crowd captioning software, and found that the readers preferred crowd captions over both professional captions and ASR.