This paper investigates single-microphone speech enhancement when the statistics of the interfering noise and of the speech are not available a priori. It thereby addresses a limitation of many current enhancement techniques and looks towards a system that is applicable in the real world. The paper focuses on a Log Gabor Wavelet (LGW) based long-term squared spectral amplitude estimator using the Maximum a Posteriori (MAP) criterion. First, a long-term cepstral mean subtraction technique with LGW is proposed to suppress telephone channel and handset effects in the speech signals. Then a novel speech enhancer based on a MAP Bayesian bivariate model is developed to suppress the background noise. The work also models the inter-scale dependency between the coefficients and their parents with a circularly symmetric probability density function related to the family of Spherically Invariant Random Processes (SIRPs). The corresponding joint estimator is derived from MAP estimation theory. The inter-scale noise variance of the coefficients is kept constant, which gives a closed-form solution. A speech presence uncertainty (SPU) estimator is a further contribution to the proposed estimator. The main contributions are therefore: (i) the combination of LGW, SIRPs and SPU for background noise reduction; (ii) LGW with long-term cepstral mean subtraction to reduce the effects of both the telephone channel and handsets; (iii) a circularly symmetric probability density function that exploits the inter-scale dependency between the coefficients and their parents, with the corresponding joint estimators derived from MAP estimation theory; (iv) a constant inter-scale noise variance of the coefficients, which gives a closed-form solution; (v) refinement of the magnitude estimates by scaling them by the SPU probability.
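To make the closed-form idea concrete, the following is a minimal sketch of a bivariate MAP shrinkage rule followed by SPU scaling. It assumes the classical bivariate shrinkage function of Sendur and Selesnick as a stand-in; the estimator derived in this paper for LGW coefficients may differ in its details. The function names, the per-coefficient `presence_prob` input, and the assumption that `noise_var` and `signal_std` are already estimated are all hypothetical choices made for illustration.

```python
import math


def bivariate_shrink(child, parent, noise_var, signal_std):
    """Shrink one coefficient using its parent at the next coarser scale.

    Assumed rule (classical bivariate shrinkage, not necessarily the
    paper's exact estimator):
        gain = max(sqrt(child^2 + parent^2) - sqrt(3) * noise_var / signal_std, 0)
               / sqrt(child^2 + parent^2)
    """
    magnitude = math.hypot(child, parent)
    if magnitude == 0.0:
        return 0.0  # nothing to shrink, and avoids division by zero
    threshold = math.sqrt(3.0) * noise_var / signal_std
    gain = max(magnitude - threshold, 0.0) / magnitude
    return gain * child


def enhance_coefficients(children, parents, noise_var, signal_std, presence_prob):
    """Apply the shrinkage rule, then scale each estimate by a
    speech-presence probability (hypothetical per-coefficient input)."""
    return [
        p * bivariate_shrink(c, q, noise_var, signal_std)
        for c, q, p in zip(children, parents, presence_prob)
    ]
```

In this sketch a coefficient survives only if the joint magnitude of the coefficient and its parent exceeds the threshold, which is how the inter-scale dependency enters; the speech-presence probability then attenuates estimates in regions where speech is unlikely.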
The proposed and existing speech enhancement algorithms are compared extensively on the NOIZEUS speech database, which contains different types of noise. We report subjective and objective evaluations of the proposed methods against four classes of algorithms: spectral subtractive, subspace, statistical-model-based and Wiener-type. Experimental results show that the proposed estimator yields a higher improvement in Segmental SNR (SSNR), lower Log Area Ratio (LAR) and Weighted Spectral Slope (WSS) distortion, and higher Perceptual Evaluation of Speech Quality (PESQ) and Mean Opinion Score (MOS) than the existing speech enhancement algorithms. On the SSNR measure, the proposed methods show an improvement of 2 dB over existing methods for almost every noise source. On the MOS measure, the proposed methods likewise improve on existing methods for almost every noise source. The proposed methods therefore aim to enhance speech quality and intelligibility at the same time.
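For readers unfamiliar with the SSNR measure, a minimal sketch of one common way to compute it is given below. The frame length of 256 samples, the non-overlapping framing, and the clamping range of -10 dB to 35 dB are assumptions taken from common practice, not settings confirmed by this paper; the function name is hypothetical.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Average per-frame SNR in dB between a clean and an enhanced signal.

    Frames are non-overlapping; each frame's SNR is clamped to
    [lo_db, hi_db] so that silent or perfectly matched frames do not
    dominate the average (assumed convention).
    """
    n = min(len(clean), len(enhanced))
    scores = []
    for start in range(0, n - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        est = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, est))
        if signal_energy == 0.0:
            continue  # skip frames with no reference energy
        if error_energy == 0.0:
            snr_db = hi_db  # perfect reconstruction: use the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        scores.append(min(max(snr_db, lo_db), hi_db))
    if not scores:
        raise ValueError("no frame with non-zero reference energy")
    return sum(scores) / len(scores)
```

Under this definition, the reported 2 dB gain would correspond to a 2 dB rise in the frame-averaged value, assuming the same framing and clamping are used for every algorithm being compared.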