Speech recognition in noisy environments: a survey
Speech Communication
Speech recognition in noisy environments using first-order vector Taylor series
Speech Communication
On stochastic feature and model compensation approaches to robust speech recognition
Speech Communication - Special issue on robust speech recognition
Acoustical and Environmental Robustness in Automatic Speech Recognition
Speech recognition in noisy environments
Speech Recognition over Digital Channels: Robustness and Standards
An evaluation study on speech feature densities for Bayesian estimation in robust ASR
Proceedings of the Third COST 2102 international training school conference on Toward autonomous, adaptive, and context-aware multimodal interfaces: theoretical and practical issues
Comparative evaluation of single-channel MMSE-based noise reduction schemes for speech recognition
Journal of Electrical and Computer Engineering
Bayesian on-line spectral change point detection: a soft computing approach for on-line ASR
International Journal of Speech Technology
Importance sampling to compute likelihoods of noise-corrupted speech
Computer Speech and Language
Feature normalization based on non-extensive statistics for speech recognition
Speech Communication
A MAP-based online estimation approach to ensemble speaker and speaking environment modeling
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
In this paper, we present our recent development of a model-domain, environment-robust adaptation algorithm that demonstrates high performance on the standard Aurora 2 speech recognition task. The algorithm consists of two main steps. First, the noise and channel parameters are estimated using multiple sources of information: a nonlinear environment-distortion model in the cepstral domain, the posterior probabilities of all Gaussians in the speech recognizer, and a truncated vector Taylor series (VTS) approximation. Second, the estimated noise and channel parameters are used to adapt the static and dynamic (delta and delta-delta) portions of the HMM means and variances. This two-step algorithm enables joint compensation of both additive and convolutive distortions (JAC). The hallmark of our new approach is the use of a nonlinear, phase-sensitive model of acoustic distortion that captures the phase asynchrony between clean speech and the mixing noise. In an experimental evaluation on the standard Aurora 2 task, the proposed Phase-JAC/VTS algorithm achieves 93.32% word accuracy using the clean-trained complex HMM backend as the baseline system for unsupervised model adaptation. This represents high recognition performance on this task without discriminative training of the HMM system. The experimental results show that the phase term, which was missing in all previous HMM adaptation work, contributes significantly to the achieved high recognition accuracy.
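The mean-adaptation step described above can be illustrated with a minimal sketch. The following is not the paper's implementation: it works directly in the log-spectral domain (the paper operates on cepstra, i.e. with DCT/IDCT matrices applied around the nonlinearity, omitted here for brevity), uses a fixed scalar phase factor `alpha` rather than an estimated phase term, and the function name is hypothetical. It only shows the shape of the phase-sensitive distortion model, y ≈ x + h + log(1 + e^(n−x−h) + 2α·e^((n−x−h)/2)), applied to a Gaussian mean.

```python
import numpy as np

def adapt_mean_phase_vts(mu_x, mu_n, mu_h, alpha=0.5):
    """Sketch: adapt a clean-speech Gaussian mean mu_x for additive
    noise (mean mu_n) and channel offset mu_h with a phase-sensitive
    distortion model in the log-spectral domain.

    alpha is a fixed phase factor standing in for the phase term the
    paper estimates; alpha = 0 recovers the conventional
    phase-insensitive log-sum model y = x + h + log(1 + e^(n-x-h)).
    """
    d = mu_n - mu_x - mu_h                     # log noise-to-speech ratio
    # Nonlinear mismatch term g(x, h, n); log1p keeps this stable
    # when the noise is far below the speech level (d << 0).
    g = np.log1p(np.exp(d) + 2.0 * alpha * np.exp(d / 2.0))
    return mu_x + mu_h + g
```

When the noise mean is far below the speech mean, `g` vanishes and the adapted mean reduces to `mu_x + mu_h`, as expected; a first-order VTS linearization of the same expression around (mu_x, mu_h, mu_n) would additionally supply the Jacobians needed to adapt the variances and the delta/delta-delta parameters.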