The ISL RT-07 Speech-to-Text System

  • Authors:
  • Matthias Wölfel; Sebastian Stüker; Florian Kraft

  • Affiliations:
  • Interactive Systems Laboratories, Institut für Theoretische Informatik, Universität Karlsruhe (TH), 76131 Karlsruhe, Germany (all authors)

  • Venue:
  • Multimodal Technologies for Perception of Humans
  • Year:
  • 2008

Abstract

This paper describes the 2007 meeting speech-to-text system for lecture rooms developed at the Interactive Systems Laboratories (ISL) for the multiple distant microphone condition. The system was evaluated in the RT-07 Rich Transcription Meeting Evaluation sponsored by the US National Institute of Standards and Technology (NIST). We describe the principal differences between our current system and those submitted in previous years: a signal-adaptive front-end (realized by warped-twice warped minimum variance distortionless response spectral estimation); improved acoustic models (including maximum mutual information estimation) and language models; cross-adaptation between systems that differ in the front-end as well as the phoneme set; a discriminative criterion, instead of the signal-to-noise ratio, for selecting the channel to be used; and decoder-based speech segmentation.