This paper considers techniques for single-channel speech enhancement based on the discrete Fourier transform (DFT). Specifically, we derive minimum mean-square error (MMSE) estimators of speech DFT coefficient magnitudes as well as of complex-valued DFT coefficients based on two classes of generalized gamma distributions, under an additive Gaussian noise assumption. The resulting generalized DFT magnitude estimator has as a special case the existing scheme based on a Rayleigh speech prior, while the complex DFT estimators generalize existing schemes based on Gaussian, Laplacian, and Gamma speech priors. Extensive simulation experiments with speech signals degraded by various additive noise sources verify that significant improvements are possible with the more recent estimators based on super-Gaussian priors. The increase in perceptual evaluation of speech quality (PESQ) over the noisy signals is about 0.5 points for street noise and about 1 point for white noise, nearly independent of input signal-to-noise ratio (SNR). The assumptions made for deriving the complex DFT estimators are less accurate than those for the magnitude estimators, leading to a higher maximum achievable speech quality with the magnitude estimators.
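To make the idea of an MMSE magnitude estimator under a generalized gamma prior concrete, the following is a minimal numerical sketch, not the closed-form estimators derived in the paper. It assumes the amplitude prior has the form p(a) ∝ a^(γν-1) exp(-β a^γ), that the noise is circular complex Gaussian with variance `noise_var`, and that the speech phase is uniform. The function name `mmse_amplitude` and all parameter names are hypothetical. The conditional mean E[A | R = r] is evaluated by brute-force integration on a grid.

```python
import numpy as np


def mmse_amplitude(r, noise_var, gamma, nu, beta, n_a=4000, n_theta=256):
    """Numerically evaluate E[A | R = r] for a generalized gamma amplitude prior.

    Assumed prior (up to normalisation):
        p(a) ∝ a**(gamma * nu - 1) * exp(-beta * a**gamma),  a > 0
    Assumed noise: circular complex Gaussian with variance noise_var.
    """
    # Midpoint grid over amplitudes; it avoids a = 0, where the prior
    # can be singular when gamma * nu < 1.
    a_max = r + 8.0 * np.sqrt(noise_var)  # likelihood is negligible beyond this
    da = a_max / n_a
    a = (np.arange(n_a) + 0.5) * da

    # Likelihood of the noisy magnitude r given amplitude a, with the unknown
    # phase difference theta averaged out. Constant factors cancel in the ratio.
    theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta
    dist2 = r**2 + a[:, None] ** 2 - 2.0 * r * a[:, None] * np.cos(theta)[None, :]
    likelihood = np.exp(-dist2 / noise_var).mean(axis=1)

    # Prior evaluated in the log domain and rescaled for numerical safety.
    log_prior = (gamma * nu - 1.0) * np.log(a) - beta * a**gamma
    weight = likelihood * np.exp(log_prior - log_prior.max())

    # Posterior mean: ratio of two sums on a uniform grid (da cancels).
    return float(np.sum(a * weight) / np.sum(weight))


if __name__ == "__main__":
    # gamma = 2, nu = 1 gives a Rayleigh prior with E[A^2] = 1 / beta.
    print(mmse_amplitude(r=1.5, noise_var=1.0, gamma=2.0, nu=1.0, beta=0.5))
    # gamma = 1 with nu < 1 gives a more peaked (super-Gaussian-like) prior.
    print(mmse_amplitude(r=1.5, noise_var=1.0, gamma=1.0, nu=0.6, beta=1.0))
```

With γ = 2 and ν = 1 the assumed prior reduces to the Rayleigh case mentioned in the abstract, so the numerical result can be checked against the known closed-form Rayleigh-prior magnitude estimator. Grid integration is far too slow for per-bin use in a real enhancement system. It is shown here only to illustrate what the estimator computes.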