Speech enhancement by joint statistical characterization in the Log Gabor Wavelet domain

Authors:
Suman Senapati;Sandipan Chakroborty;Goutam Saha
Affiliations:
Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology, Kharagpur, Kharagpur 721 302, India (all authors)
Venue:
Speech Communication
Year:
2008

Citing 5

Tracking speech-presence uncertainty to improve speech enhancement in non-stationary noise environments
ICASSP '99: Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 02

Corpora for the evaluation of speaker recognition systems
ICASSP '99: Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 02

Efficient alternatives to the Ephraim and Malah suppression rule for audio signal enhancement
EURASIP Journal on Applied Signal Processing

Speech enhancement by MAP spectral amplitude estimation using a super-Gaussian speech model
EURASIP Journal on Applied Signal Processing

De-noising by soft-thresholding
IEEE Transactions on Information Theory

Cited 2

Enhancement of noisy speech by temporal and spectral processing
Speech Communication

Optimal speech enhancement under signal presence uncertainty using Log Gabor Wavelet and Bayesian Joint Statistics
International Journal of Speech Technology

Abstract

In speech enhancement, Bayesian marginal models cannot capture the statistical dependencies between wavelet scales. Simple non-linear estimators for wavelet-based denoising assume that the wavelet coefficients at different scales are independent, yet these coefficients exhibit significant inter-scale dependencies. This paper introduces a method that models the dependency between each coefficient and its parent with a Circularly Symmetric Probability Density Function (CS-PDF), related to the family of Spherically Invariant Random Processes (SIRPs), in the Log Gabor Wavelet (LGW) domain; the corresponding joint shrinkage estimators are derived using Maximum a Posteriori (MAP) estimation theory. Two joint shrinkage estimators are presented. In the first, the inter-scale variance of the LGW coefficients is held constant, which yields a closed-form solution. In the second, a more complex approach, the variance is not constrained to be constant. The proposed methods are also shown to perform better when speech presence uncertainty is taken into account. The robustness of the proposed frameworks is tested on 50 speakers of the POLYCOST and YOHO speech corpora in four different noisy environments against four established speech enhancement algorithms. Experimental results show that the proposed estimators yield a larger improvement in Segmental SNR (S-SNR) and a lower Log Spectral Distortion (LSD) than the other estimators. In a second evaluation, the proposed speech enhancement techniques give more robust digit recognition in noisy conditions on the AURORA 2.0 speech corpus than competing methods.
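To make the idea of a joint (coefficient and parent) shrinkage estimator concrete, the sketch below implements the well-known bivariate shrinkage rule of Sendur and Selesnick, which is a closed-form MAP estimator under a circularly symmetric bivariate prior with constant variance. It is offered only as an illustration of the general technique: the abstract does not state the exact form of the paper's estimators, so treating this rule as representative is an assumption, and the names `bivariate_shrink`, `noise_var`, and `signal_std` are hypothetical.

```python
import math


def bivariate_shrink(child, parent, noise_var, signal_std):
    """Shrink a noisy wavelet coefficient using its parent at the coarser scale.

    Illustrative bivariate shrinkage rule (Sendur and Selesnick), NOT
    necessarily the estimator derived in the paper:

        w_hat = max(r - sqrt(3) * noise_var / signal_std, 0) / r * child,
        r     = sqrt(|child|^2 + |parent|^2)

    child, parent : noisy coefficients (real or complex)
    noise_var     : noise variance (assumed known or estimated elsewhere)
    signal_std    : standard deviation of the clean coefficients (> 0)
    """
    # Joint magnitude of the coefficient and its parent.
    magnitude = math.sqrt(abs(child) ** 2 + abs(parent) ** 2)
    if magnitude == 0.0:
        return 0.0
    # The threshold grows with the noise level and shrinks with signal strength.
    threshold = math.sqrt(3.0) * noise_var / signal_std
    gain = max(magnitude - threshold, 0.0) / magnitude
    return gain * child


if __name__ == "__main__":
    # A coefficient with a large parent is attenuated only mildly ...
    print(bivariate_shrink(3.0, 4.0, noise_var=1.0, signal_std=math.sqrt(3.0)))
    # ... while a small coefficient with a small parent is set to zero.
    print(bivariate_shrink(0.1, 0.1, noise_var=1.0, signal_std=1.0))
```

Unlike marginal soft-thresholding, the gain here depends on the joint magnitude of the coefficient and its parent, so a small coefficient whose parent is large is more likely to be retained as speech.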