Optimization and evaluation of sigmoid function with a priori SNR estimate for real-time speech enhancement

  • Authors:
  • Pei Chee Yong;Sven Nordholm;Hai Huyen Dam

  • Affiliations:
  • Curtin University, Kent Street, Bentley, WA 6102, Australia;Curtin University, Kent Street, Bentley, WA 6102, Australia;Curtin University, Kent Street, Bentley, WA 6102, Australia

  • Venue:
  • Speech Communication
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, an a priori signal-to-noise ratio (SNR) estimator with a modified sigmoid gain function is proposed for real-time speech enhancement. The proposed sigmoid gain function has three parameters, which can be optimized such that they match conventional gain functions. In addition, the joint temporal dynamics between the SNR estimate and the spectral gain function is investigated to improve the performance of the speech enhancement scheme. As the widely-used decision-directed (DD) a priori SNR estimate has a well-known one-frame delay that leads to the degradation of speech quality, a modified a priori SNR estimator is proposed for the DD approach to overcome this delay. Evaluations are performed by utilizing the objective evaluation metric that measures the trade-off between the noise reduction, the speech distortion and the musical noise in the enhanced signal. The results are compared using the PESQ and the SNRseg measures as well as subjective listening tests. Simulation results show that the proposed gain function, which can flexibly model exponential distributions, is a potential alternative speech enhancement gain function.