Noisy speech recognition performance of discriminative HMMs

Authors:
Jun Du;Peng Liu;Frank Soong;Jian-Lai Zhou;Ren-Hua Wang
Affiliations:
University of Science and Technology of China, Hefei;Microsoft Research Asia, Beijing;Microsoft Research Asia, Beijing;Microsoft Research Asia, Beijing;University of Science and Technology of China, Hefei
Venue:
ISCSLP'06 Proceedings of the 5th international conference on Chinese Spoken Language Processing
Year:
2006

Abstract

This study investigates discriminatively trained HMMs in both clean and noisy environments. First, recognition error is defined at several levels: string, word, phone, and acoustics. A high-resolution error measure based on minimum divergence (MD) is proposed and investigated alongside the other error measures. Using two speaker-independent continuous digit databases, Aurora2 (English) and CNDigits (Mandarin Chinese), recognizers trained with different error measures and different training modes are evaluated under various noise types and SNR conditions. Experimental results show that the discriminatively trained models outperform the maximum likelihood baseline systems. Specifically, MD-trained systems achieve relative error reductions of 17.62% and 18.52% with multi-training on Aurora2 and CNDigits, respectively.
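
The relative error reductions quoted in the abstract are presumably computed in the conventional way, as the drop in error rate divided by the baseline error rate. The sketch below illustrates that arithmetic only. The function name `relative_error_reduction` and the error rates are hypothetical and are not taken from the paper, which does not list its absolute error rates here.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the relative error reduction, in percent, against a baseline.

    Both arguments are error rates in the same unit (for example, word
    error rate in percent). This is the conventional definition and is
    assumed, not confirmed, to be the one used in the paper.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical error rates (percent), chosen only to show the arithmetic.
baseline_wer = 10.0  # maximum likelihood baseline (made-up value)
md_wer = 8.0         # discriminatively trained system (made-up value)

print(f"{relative_error_reduction(baseline_wer, md_wer):.2f}%")  # prints 20.00%
```

Under this definition, a reduction of 17.62% means the discriminatively trained system makes roughly 82% as many errors as the baseline, whatever the absolute error rate is.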