Quantile based histogram equalization for noise robust large vocabulary speech recognition

Authors:
F. Hilger;H. Ney
Affiliations:
Telenet GmbH Kommunikationsysteme, Munich, Germany;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2006

Abstract

The noise robustness of automatic speech recognition systems can be improved by reducing any mismatch between the training and test data distributions during feature extraction. When the parameters of the transformation functions are derived from the quantiles of these distributions, they can be estimated reliably from small amounts of data. This paper gives a detailed review of quantile equalization applied to the Mel-scaled filter bank, including considerations for its use in online systems and improvements from a second transformation step that combines neighboring filter channels. Recognition tests show that earlier experimental observations on small-vocabulary recognition tasks are confirmed on the larger-vocabulary Aurora 4 noisy Wall Street Journal database: the word error rate was reduced from 45.7% to 25.5% (clean training) and from 19.5% to 17.0% (multicondition training).
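
The idea of estimating a transformation from a few quantiles can be illustrated with a minimal sketch. The abstract does not state the functional form of the transformation, so the blend of a power function and the identity used here, the grid search, and all names (`quantiles`, `transform`, `fit_parameters`) are assumptions chosen for illustration, not a restatement of the paper's method. The toy data stand in for one filter-bank channel.

```python
def quantiles(values, num_intervals=4):
    """Return num_intervals + 1 quantiles (minimum ... maximum),
    using linear interpolation between sorted samples."""
    s = sorted(values)
    n = len(s)
    out = []
    for i in range(num_intervals + 1):
        pos = i * (n - 1) / num_intervals
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(s[lo] * (1.0 - frac) + s[hi] * frac)
    return out


def transform(y, q_max, alpha, gamma):
    """Assumed transformation: power function blended with the identity,
    scaled by the largest quantile q_max (must be positive)."""
    r = max(y, 0.0) / q_max
    return q_max * (alpha * r ** gamma + (1.0 - alpha) * r)


def fit_parameters(test_q, train_q, alphas, gammas):
    """Grid search for the (alpha, gamma) pair that minimizes the squared
    distance between transformed test quantiles and training quantiles."""
    q_max = test_q[-1]
    best = None
    for a in alphas:
        for g in gammas:
            err = sum((transform(q, q_max, a, g) - t) ** 2
                      for q, t in zip(test_q, train_q))
            if best is None or err < best[0]:
                best = (err, a, g)
    return best[1], best[2], best[0]


# Toy data (hypothetical): the "test" values are spread linearly, while the
# "training" values follow a squared curve over the same range.
train = [10.0 * (i / 99) ** 2 for i in range(100)]
test = [10.0 * (i / 99) for i in range(100)]

train_q = quantiles(train)
test_q = quantiles(test)

alphas = [0.1 * k for k in range(11)]        # 0.0 ... 1.0
gammas = [1.0 + 0.1 * k for k in range(21)]  # 1.0 ... 3.0
alpha, gamma, err = fit_parameters(test_q, train_q, alphas, gammas)

# Apply the fitted transformation to every test value.
equalized = [transform(y, test_q[-1], alpha, gamma) for y in test]
```

Only five quantiles per distribution enter the fit, which is why such a scheme needs little data: the two parameters are matched against a handful of robust order statistics rather than a full histogram.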