The rich transcription 2005 spring meeting recognition evaluation

  • Authors:
  • Jonathan G. Fiscus; Nicolas Radde; John S. Garofolo; Audrey Le; Jerome Ajot; Christophe Laprun

  • Affiliations:
  • National Institute of Standards and Technology, Gaithersburg, MD (all authors)

  • Venue:
  • MLMI'05: Proceedings of the Second International Conference on Machine Learning for Multimodal Interaction
  • Year:
  • 2005

Abstract

This paper presents the design and results of the Rich Transcription Spring 2005 (RT-05S) Meeting Recognition Evaluation. This evaluation is the third in a series of community-wide evaluations of language technologies in the meeting domain. For 2005, four evaluation tasks were supported: a speech-to-text (STT) transcription task and three diarization tasks, "Who Spoke When", "Speech Activity Detection", and "Source Localization". The latter two were first-time experimental proof-of-concept tasks and were treated as "dry runs". For the STT task, the lowest word error rate for the multiple distant microphone condition was 30.0%, which represented an impressive 33% relative reduction from the best result obtained in the last such evaluation, the Rich Transcription Spring 2004 (RT-04S) Meeting Recognition Evaluation. For the diarization "Who Spoke When" task, the lowest diarization error rate was 18.56%, which represented a 19% relative reduction from that of RT-04S.
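The year-over-year improvements in the abstract are stated as relative reductions of an error rate. As a sketch of the arithmetic (the helper function names are illustrative, not from the paper), the prior-year best rates implied by the stated figures can be back-computed from the new rate and the relative reduction:

```python
def relative_reduction(old: float, new: float) -> float:
    """Relative error-rate reduction (e.g. of WER or DER), as a fraction."""
    return (old - new) / old

def implied_prior(new: float, reduction: float) -> float:
    """Prior error rate implied by a new rate and a stated relative reduction."""
    return new / (1.0 - reduction)

# STT: 30.0% WER with a 33% relative reduction over RT-04S implies
# an RT-04S best of roughly 30.0 / (1 - 0.33) = 44.8% WER.
print(round(implied_prior(30.0, 0.33), 1))  # 44.8

# Diarization: 18.56% DER with a 19% relative reduction implies
# an RT-04S best of roughly 22.9% DER.
print(round(implied_prior(18.56, 0.19), 1))  # 22.9
```

These back-computed prior-year rates are derived solely from the rounded percentages in the abstract, so they are approximate.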