Using artificially reverberated training data in distant-talking ASR

  • Authors:
  • Tino Haderlein, Elmar Nöth, Wolfgang Herbordt, Walter Kellermann, Heinrich Niemann

  • Affiliations:
  • Chair for Pattern Recognition (Informatik 5), University of Erlangen-Nuremberg, Erlangen, Germany (Haderlein, Nöth, Niemann); Chair of Multimedia Communications and Signal Processing, University of Erlangen-Nuremberg, Erlangen, Germany (Herbordt, Kellermann)

  • Venue:
  • TSD'05: Proceedings of the 8th International Conference on Text, Speech and Dialogue
  • Year:
  • 2005

Abstract

Automatic Speech Recognition (ASR) in reverberant rooms can be improved by choosing training data from the same acoustic environment as the test data. In real-world applications this is often not possible. A solution to this problem is to take speech signals recorded with a close-talking microphone and reverberate them artificially with multiple room impulse responses. This paper presents results for recognizers whose training data differ in size and in the percentage of reverberated signals, in order to find the best combination for data sets with different degrees of reverberation. The average error rate on a close-talking and a distant-talking test set was thus reduced by 29% relative.
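
The data-augmentation step described in the abstract amounts to convolving clean close-talking recordings with room impulse responses. Below is a minimal sketch of that step, assuming mono 16-bit WAV input and hypothetical file names; the corpora and impulse responses used in the paper are not reproduced here.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

def reverberate(speech, rir):
    """Convolve a close-talking speech signal with a room impulse response."""
    rev = fftconvolve(speech, rir)[: len(speech)]   # trim to original length
    rev /= np.max(np.abs(rev)) + 1e-12              # normalize to avoid clipping
    return rev

# Hypothetical file names, assumed to be mono 16-bit WAV files.
rate, speech = wavfile.read("close_talk_utterance.wav")
_, rir = wavfile.read("room_impulse_response.wav")

speech = speech.astype(np.float64)
rir = rir.astype(np.float64)

reverberated = reverberate(speech, rir)
wavfile.write("reverberated_utterance.wav", rate,
              (reverberated * 32767).astype(np.int16))
```

In practice, each training utterance would be convolved with one of several impulse responses (measured or simulated for different rooms and microphone distances) so that the reverberated training material covers a range of acoustic conditions rather than a single room.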