Optimization and evaluation of sigmoid function with a priori SNR estimate for real-time speech enhancement

Authors:
Pei Chee Yong;Sven Nordholm;Hai Huyen Dam
Affiliations:
Curtin University, Kent Street, Bentley, WA 6102, Australia;Curtin University, Kent Street, Bentley, WA 6102, Australia;Curtin University, Kent Street, Bentley, WA 6102, Australia
Venue:
Speech Communication
Year:
2013

Citing 10
Cited 1

Speech enhancement based on a priori signal to noise estimation

ICASSP '96: Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 02
Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs

ICASSP '01: Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 02
Speech spectral amplitude estimators using optimally shaped Gamma and Chi priors

Speech Communication
Single-channel speech enhancement using spectral subtraction in the short-time modulation domain

Speech Communication
Speech enhancement using a minimum mean-square error short-time spectral modulation magnitude estimator

Speech Communication
Voice activity detection based on multiple statistical models

IEEE Transactions on Signal Processing - Part I
Analysis of the Decision-Directed SNR Estimator for Speech Enhancement With Respect to Low-SNR and Transient Conditions

IEEE Transactions on Audio, Speech, and Language Processing
A Data-Driven Approach to A Priori SNR Estimation

IEEE Transactions on Audio, Speech, and Language Processing
Improved Signal-to-Noise Ratio Estimation for Speech Enhancement

IEEE Transactions on Audio, Speech, and Language Processing
Evaluation of Objective Quality Measures for Speech Enhancement

IEEE Transactions on Audio, Speech, and Language Processing

A hybrid noise suppression filter for accuracy enhancement of commercial speech recognizers in varying noisy conditions

Applied Soft Computing

Abstract

In this paper, an a priori signal-to-noise ratio (SNR) estimator with a modified sigmoid gain function is proposed for real-time speech enhancement. The proposed sigmoid gain function has three parameters, which can be optimized to match conventional gain functions. In addition, the joint temporal dynamics of the SNR estimate and the spectral gain function are investigated to improve the performance of the speech enhancement scheme. Because the widely used decision-directed (DD) a priori SNR estimate has a well-known one-frame delay that degrades speech quality, a modified a priori SNR estimator is proposed for the DD approach to overcome this delay. Evaluations use an objective metric that measures the trade-off between noise reduction, speech distortion, and musical noise in the enhanced signal. The results are compared using the PESQ and segmental SNR (SNRseg) measures as well as subjective listening tests. Simulation results show that the proposed gain function, which can flexibly model exponential distributions, is a potential alternative speech enhancement gain function.
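
As a rough illustration of the two ingredients named in the abstract, the following Python sketch pairs a sigmoid-shaped gain with the classic decision-directed (DD) a priori SNR recursion for a single frequency bin. The abstract does not give the paper's equations, so everything here is an assumption: the three parameters (slope, midpoint_db, floor) are a hypothetical parametrization rather than the authors' one, the DD recursion is the textbook form rather than the modified estimator the paper proposes, and the function names and default values are invented for the example.

```python
import math


def sigmoid_gain(xi, slope=1.0, midpoint_db=0.0, floor=0.1):
    """Hypothetical three-parameter sigmoid gain (NOT the paper's exact form).

    xi is the a priori SNR on a linear scale. The gain rises smoothly
    from `floor` towards 1 as the SNR in dB passes `midpoint_db`.
    """
    xi_db = 10.0 * math.log10(max(xi, 1e-12))
    s = 1.0 / (1.0 + math.exp(-slope * (xi_db - midpoint_db)))
    return floor + (1.0 - floor) * s


def wiener_gain(xi):
    """Conventional Wiener gain, used here only as a reference curve."""
    return xi / (1.0 + xi)


def dd_a_priori_snr(noisy_power, noise_power, gain_fn, alpha=0.98, xi_min=1e-3):
    """Textbook decision-directed a priori SNR estimate for one frequency bin.

    noisy_power: per-frame noisy periodogram values |Y|^2 for this bin.
    noise_power: noise power estimate (assumed constant for simplicity).
    gain_fn:     spectral gain as a function of the a priori SNR.
    """
    xi_track = []
    prev_clean_power = 0.0  # |G * Y|^2 from the previous frame
    for p in noisy_power:
        gamma = p / noise_power        # a posteriori SNR
        ml = max(gamma - 1.0, 0.0)     # instantaneous SNR estimate
        xi = alpha * prev_clean_power / noise_power + (1.0 - alpha) * ml
        xi = max(xi, xi_min)
        gain = gain_fn(xi)
        prev_clean_power = (gain ** 2) * p
        xi_track.append(xi)
    return xi_track


# Speech onset at frame 3: the noisy power jumps from 1 to 10 (noise power 1).
frames = [1.0, 1.0, 1.0, 10.0, 10.0, 10.0]
track = dd_a_priori_snr(frames, 1.0, wiener_gain)
```

In this toy run the instantaneous SNR at the onset frame is 9, but the DD estimate reaches only about 0.18 there and climbs slowly afterwards, because with alpha close to 1 the estimate leans on the previous frame's output. This lag is the kind of delay the abstract refers to. Passing sigmoid_gain in place of wiener_gain shows how the gain shape and the SNR recursion interact over time.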