International Journal of Speech Technology
In speech enhancement, Bayesian marginal models cannot capture the statistical dependencies between wavelet coefficients at different scales. Simple non-linear estimators for wavelet-based denoising assume that coefficients at different scales are independent, yet wavelet coefficients exhibit significant inter-scale dependencies. This paper introduces a method that models the dependency between each coefficient and its parent with a Circularly Symmetric Probability Density Function (CS-PDF) related to the family of Spherically Invariant Random Processes (SIRPs) in the Log Gabor Wavelet (LGW) domain, and derives the corresponding joint shrinkage estimators using Maximum a Posteriori (MAP) estimation theory. Two joint shrinkage estimators are presented. In the first, the inter-scale variance of the LGW coefficients is held constant, which yields a closed-form solution. In the second, a more complex approach, the variance is not constrained to be constant. The proposed methods are also shown to perform better when speech uncertainty is taken into consideration. The robustness of the proposed frameworks is tested on 50 speakers from the POLYCOST and YOHO speech corpora in four different noisy environments against four established speech enhancement algorithms. Experimental results show that the proposed estimators yield a higher improvement in Segmental SNR (S-SNR) and a lower Log Spectral Distortion (LSD) than the other estimators. In a second evaluation, the proposed speech enhancement techniques give more robust digit recognition in noisy conditions on the AURORA 2.0 speech corpus than competing methods.
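The two objective measures named in the abstract, Segmental SNR and Log Spectral Distortion, can be illustrated with a minimal sketch. The abstract does not state the exact definitions used, so everything here is an assumption: the frame length and hop, the Hann window, the [-10, 35] dB clamping range for per-frame SNR, and the function names `frame_signal`, `segmental_snr`, and `log_spectral_distortion` are all hypothetical choices based on commonly used forms of these metrics.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping frames (rows). Assumed framing."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def segmental_snr(clean, enhanced, frame_len=256, hop=128, lo=-10.0, hi=35.0):
    """Mean of per-frame SNR in dB, clamped to [lo, hi] (assumed limits)."""
    s = frame_signal(clean, frame_len, hop)
    e = frame_signal(enhanced, frame_len, hop)
    eps = 1e-12  # guards against log(0) and division by zero
    signal_energy = np.sum(s ** 2, axis=1)
    error_energy = np.sum((s - e) ** 2, axis=1)
    snr_db = 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))
    return float(np.mean(np.clip(snr_db, lo, hi)))

def log_spectral_distortion(clean, enhanced, frame_len=256, hop=128):
    """Mean over frames of the RMS difference between log power spectra (dB)."""
    win = np.hanning(frame_len)
    eps = 1e-12
    s_pow = np.abs(np.fft.rfft(frame_signal(clean, frame_len, hop) * win, axis=1)) ** 2
    e_pow = np.abs(np.fft.rfft(frame_signal(enhanced, frame_len, hop) * win, axis=1)) ** 2
    diff_db = 10.0 * np.log10(s_pow + eps) - 10.0 * np.log10(e_pow + eps)
    return float(np.mean(np.sqrt(np.mean(diff_db ** 2, axis=1))))

# Synthetic usage example: a sine wave stands in for clean speech.
rng = np.random.default_rng(0)
t = np.arange(4000)
clean = np.sin(2 * np.pi * 0.01 * t)
noisy = clean + 0.1 * rng.standard_normal(t.size)
print("S-SNR (dB):", segmental_snr(clean, noisy))
print("LSD (dB):", log_spectral_distortion(clean, noisy))
```

Under these assumed definitions, a better enhancement algorithm shows a higher S-SNR and a lower LSD against the clean reference, which is the direction of improvement the abstract reports.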