Stereo-based stochastic mapping with discriminative training for noise robust speech recognition

Authors:
Xiaodong Cui; Mohamed Afify; Yuqing Gao
Affiliations:
IBM T. J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY, 10598, USA; Orange Lab, Smart Village, Cairo, Egypt; IBM T. J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY, 10598, USA
Venue:
ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
Year:
2009

Abstract

This paper presents an enhanced stochastic mapping technique in the discriminative feature (fMPE) space that exploits stereo data for noise-robust LVCSR. Both MMSE and MAP estimates of the mapping are given, and their performance is compared. Because the MAP estimate is iterative, we show that the MMSE and MAP estimates can be combined and that the combination outperforms either estimate alone. Multi-style discriminative training with the minimum phone error (MPE) criterion is further applied to the compensated features, yielding significant performance improvements on real-world noisy test sets.
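
The abstract does not state the estimator formulas, so the following is only a minimal sketch of what an MMSE estimate of a stochastic mapping could look like. It assumes the mapping is modelled by a Gaussian mixture over stacked clean and noisy feature vectors z = [x; y] learned from stereo data, in which case the MMSE estimate of the clean feature is the posterior-weighted sum of per-component conditional means. The function name `mmse_estimate` and all of its parameters are hypothetical, the fMPE transform and the MAP iteration are not shown, and nothing here is taken from the authors' implementation.

```python
import numpy as np

def mmse_estimate(y, weights, means, covs, dim_x):
    """MMSE estimate of clean x given noisy y under a joint GMM over z = [x; y].

    Assumed (hypothetical) inputs:
      weights: K mixture weights
      means:   K joint mean vectors of length dim_x + dim_y
      covs:    K joint covariance matrices, (dim_x + dim_y) square
    """
    y = np.asarray(y, dtype=float)
    K = len(weights)
    log_post = np.empty(K)
    cond_means = np.empty((K, dim_x))
    for k in range(K):
        mean_k = np.asarray(means[k], dtype=float)
        cov_k = np.asarray(covs[k], dtype=float)
        mu_x, mu_y = mean_k[:dim_x], mean_k[dim_x:]
        S_xy = cov_k[:dim_x, dim_x:]   # clean/noisy cross-covariance
        S_yy = cov_k[dim_x:, dim_x:]   # noisy-feature covariance
        diff = y - mu_y
        solved = np.linalg.solve(S_yy, diff)   # S_yy^{-1} (y - mu_y)
        _, logdet = np.linalg.slogdet(S_yy)
        # log of w_k * N(y; mu_y, S_yy), dropping the constant shared by all k
        log_post[k] = np.log(weights[k]) - 0.5 * (logdet + diff @ solved)
        # conditional mean E[x | y, k] of a joint Gaussian
        cond_means[k] = mu_x + S_xy @ solved
    # normalise component posteriors p(k | y) in a numerically stable way
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    return post @ cond_means
```

With a single component, the estimate reduces to ordinary linear regression of the clean feature on the noisy one; with several components, the noisy observation softly selects which local linear map to apply.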