Using artificially reverberated training data in distant-talking ASR

  • Authors:
  • Tino Haderlein, Elmar Nöth, Wolfgang Herbordt, Walter Kellermann, Heinrich Niemann

  • Affiliations:
  • Chair for Pattern Recognition (Informatik 5), University of Erlangen-Nuremberg, Erlangen, Germany (Haderlein, Nöth, Niemann); Chair of Multimedia Communications and Signal Processing, University of Erlangen-Nuremberg, Erlangen, Germany (Herbordt, Kellermann)

  • Venue:
  • TSD'05: Proceedings of the 8th International Conference on Text, Speech and Dialogue
  • Year:
  • 2005

Abstract

Automatic Speech Recognition (ASR) in reverberant rooms can be improved by choosing training data from the same acoustic environment as the test data. In real-world applications this is often not possible. A solution to this problem is to take speech signals recorded with a close-talking microphone and reverberate them artificially with multiple room impulse responses. This paper presents results for recognizers whose training data differ in size and in the percentage of reverberated signals, in order to find the best combination for data sets with different degrees of reverberation. The average error rate on a close-talking and a distant-talking test set was thus reduced by 29% relative.
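
The data-augmentation step described in the abstract amounts to convolving clean close-talking recordings with room impulse responses. Below is a minimal sketch of that step, assuming mono 16-bit WAV input and hypothetical file names; the corpora and impulse responses used in the paper are not reproduced here.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

def reverberate(speech, rir):
    """Convolve a close-talking speech signal with a room impulse response."""
    rev = fftconvolve(speech, rir)[: len(speech)]   # trim to original length
    rev /= np.max(np.abs(rev)) + 1e-12              # normalize to avoid clipping
    return rev

# Hypothetical file names, assumed to be mono 16-bit WAV files.
rate, speech = wavfile.read("close_talk_utterance.wav")
_, rir = wavfile.read("room_impulse_response.wav")

speech = speech.astype(np.float64)
rir = rir.astype(np.float64)

reverberated = reverberate(speech, rir)
wavfile.write("reverberated_utterance.wav", rate,
              (reverberated * 32767).astype(np.int16))
```

In practice, each training utterance would be convolved with one of several impulse responses (measured or simulated for different rooms and microphone distances) so that the reverberated training material covers a range of acoustic conditions rather than a single room.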