MAP Estimators for Speech Enhancement Under Normal and Rayleigh Inverse Gaussian Distributions

Authors:
Richard C. Hendriks;Rainer Martin
Affiliations:
Dept. of Mediamatics, Delft Univ. of Technol.;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2007

Citing 0
Cited 2

Combining missing-feature theory, speech enhancement, and speaker-dependent/-independent modeling for speech separation

Computer Speech and Language
An evaluation study on speech feature densities for Bayesian estimation in robust ASR

Proceedings of the Third COST 2102 international training school conference on Toward autonomous, adaptive, and context-aware multimodal interfaces: theoretical and practical issues

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper presents a new class of estimators for speech enhancement in the discrete Fourier transform (DFT) domain, where we consider a multidimensional normal inverse Gaussian (MNIG) distribution for the speech DFT coefficients. The MNIG distribution can model a wide range of processes, from heavy-tailed to less heavy-tailed processes. Under the MNIG distribution complex DFT and amplitude estimators are derived. In contrast to other estimators, the suppression characteristics of the MNIG-based estimators can be adapted online to the underlying distribution of the speech DFT coefficients. Compared to noise suppression algorithms based on preselected super-Gaussian distributions, the MNIG-based complex DFT and amplitude estimators lead to a performance improvement in terms of segmental signal-to-noise ratio (SNR) in the order of 0.3 to 0.6 dB and 0.2 to 0.6 dB, respectively