Using the Amazon Mechanical Turk to transcribe and annotate meeting speech for extractive summarization

  • Authors:
  • Matthew Marge; Satanjeev Banerjee; Alexander I. Rudnicky

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • CSLDAMT '10 Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk
  • Year:
  • 2010

Abstract

Due to its complexity, meeting speech poses a challenge for both transcription and annotation. While Amazon's Mechanical Turk (MTurk) has been shown to produce good results for some types of speech, its suitability for transcription and annotation of spontaneous speech has not been established. We find that MTurk can be used to produce high-quality transcriptions, and we describe two techniques for doing so (voting and corrective). We also show that a similar approach can produce high-quality annotations useful for summarization systems. In both cases, accuracy is comparable to that obtained using trained personnel.