Minimum Mean-Squared Error Estimation of Mel-Frequency Cepstral Coefficients Using a Novel Distortion Model

Authors:
K. M. Indrebo;R. J. Povinelli;M. T. Johnson
Affiliations:
Dept. of Electr. & Comput. Eng., Marquette Univ., Milwaukee, WI;-;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2008

Citing 0
Cited 5

Improved wavelet feature extraction using kernel analysis for text independent speaker recognition
Digital Signal Processing

MMSE estimation of log-filterbank energies for robust speech recognition
Speech Communication

An evaluation study on speech feature densities for Bayesian estimation in robust ASR
Proceedings of the Third COST 2102 international training school conference on Toward autonomous, adaptive, and context-aware multimodal interfaces: theoretical and practical issues

Comparative evaluation of single-channel MMSE-Based noise reduction schemes for speech recognition
Journal of Electrical and Computer Engineering

Environmental robust speech and speaker recognition through multi-channel histogram equalization
Neurocomputing

Abstract

This paper proposes a new method for statistical estimation of Mel-frequency cepstral coefficients (MFCCs) in noisy speech signals. Previous research has shown that model-based feature-domain enhancement of speech signals can significantly improve the accuracy of robust speech recognition. These methods typically work in the log-spectral or cepstral domain, where the nonlinear interaction of speech and noise makes the distortion models highly complex. To address this, an additive cepstral distortion model (ACDM) is developed and used with a minimum mean-squared error (MMSE) estimator to recover MFCC features corrupted by additive noise. The proposed ACDM-MMSE estimation algorithm is evaluated on the Aurora2 database and is shown to provide a significant improvement in word recognition accuracy over the baseline.
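
The abstract does not give the details of the ACDM, so the following is only a generic sketch of what MMSE estimation under an additive distortion model can look like, not the authors' algorithm. It assumes (hypothetically) that the noisy feature vector is y = x + d, that the clean features x follow a diagonal-covariance Gaussian mixture prior, and that the distortion d is Gaussian and independent of x. The function name `mmse_estimate` and all parameter names are illustrative.

```python
import numpy as np

def mmse_estimate(y, weights, means, variances, d_mean, d_var):
    """MMSE estimate of clean features x from noisy features y = x + d.

    Assumptions (illustrative, not taken from the paper):
      x ~ sum_k weights[k] * N(means[k], diag(variances[k]))
      d ~ N(d_mean, diag(d_var)), independent of x

    Shapes: y (D,), weights (K,), means (K, D), variances (K, D),
            d_mean (D,), d_var (D,).
    """
    y = np.asarray(y, dtype=float)

    # Under each mixture component, y is Gaussian with shifted mean
    # and inflated variance.
    obs_mean = means + d_mean            # (K, D)
    obs_var = variances + d_var          # (K, D)

    # Log-likelihood of y under each component (diagonal Gaussian).
    log_lik = -0.5 * np.sum(
        np.log(2.0 * np.pi * obs_var) + (y - obs_mean) ** 2 / obs_var,
        axis=1,
    )

    # Component posteriors p(k | y), normalized in the log domain
    # for numerical stability.
    log_post = np.log(weights) + log_lik
    log_post -= np.max(log_post)
    post = np.exp(log_post)
    post /= post.sum()

    # Per-component conditional mean E[x | y, k] (a Wiener-like gain).
    gain = variances / obs_var           # (K, D)
    cond_means = means + gain * (y - obs_mean)

    # MMSE estimate: posterior-weighted average of the conditional means.
    return post @ cond_means


# Usage: one component with unit prior variance and unit distortion
# variance pulls the estimate halfway from the prior mean toward y.
x_hat = mmse_estimate(
    y=np.array([2.0]),
    weights=np.array([1.0]),
    means=np.array([[0.0]]),
    variances=np.array([[1.0]]),
    d_mean=np.array([0.0]),
    d_var=np.array([1.0]),
)
```

Because the distortion is treated as additive and Gaussian in this sketch, the estimator has a closed form. This is the kind of simplification an additive model allows compared with the nonlinear speech-noise interaction in the log-spectral or cepstral domain.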