Speech enhancement based on undecimated wavelet packet-perceptual filterbanks and MMSE-STSA estimation in various noise environments

  • Authors:
  • Hacı Taşmaz; Ergun Erçelebi

  • Affiliations:
  • Vocational High School, University of Gaziantep, 27310 Gaziantep, Turkey; Department of Electrical and Electronics Engineering, University of Gaziantep, 27310 Gaziantep, Turkey

  • Venue:
  • Digital Signal Processing
  • Year:
  • 2008

Abstract

In this paper, we propose a new speech enhancement system that integrates a perceptual filterbank with minimum mean square error short-time spectral amplitude (MMSE-STSA) estimation modified according to speech presence uncertainty. The perceptual filterbank is designed by adjusting the undecimated wavelet packet decomposition (UWPD) tree according to the critical bands of the psychoacoustic model of the human auditory system. The modified MMSE-STSA estimator is then used to estimate the speech in the undecimated wavelet packet domain. The perceptual filterbank provides a good auditory representation (sufficient frequency resolution), good perceptual speech quality, and a low computational load. The MMSE-STSA estimator relies on the a priori SNR, its key parameter, which is estimated with the "decision-directed" method. When correctly tuned, the decision-directed method provides a trade-off between noise reduction and signal distortion. The proposed method is compared with two other popular methods, Wiener estimation and MMSE log-spectral amplitude (MMSE-LSA) estimation in the frequency domain. To evaluate the proposed system, three objective quality measures (SNR, segmental SNR (segSNR), and Itakura-Saito distance (ISd)) are computed for various noise types and SNRs. The experimental results and objective quality measures support the performance of the proposed system: it provides sufficient noise reduction, good intelligibility, and good perceptual quality without causing considerable signal distortion or musical background noise.
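
The decision-directed a priori SNR estimate mentioned in the abstract can be illustrated with a minimal sketch for a single subband. This is not the authors' implementation. The function name, the smoothing factor `alpha = 0.98`, the floor `xi_min`, and the first-frame initialization are assumptions made for illustration, and a Wiener gain stands in for the MMSE-STSA gain (which requires Bessel functions) so that the example runs with the standard library alone.

```python
def decision_directed_snr(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Track the a priori SNR across frames for one subband.

    noisy_power: per-frame power |Y|^2 of the noisy coefficient.
    noise_power: noise power estimate, assumed known and constant here.
    alpha, xi_min: illustrative tuning values, not taken from the paper.
    """
    xi_track = []
    prev_clean_power = None  # |A_hat|^2 from the previous frame
    for y2 in noisy_power:
        gamma = y2 / noise_power           # a posteriori SNR
        ml_term = max(gamma - 1.0, 0.0)    # instantaneous estimate
        if prev_clean_power is None:
            xi = ml_term                   # assumed first-frame initialization
        else:
            # Weighted sum of the previous frame's estimate and the current one.
            xi = alpha * prev_clean_power / noise_power + (1.0 - alpha) * ml_term
        xi = max(xi, xi_min)               # floor keeps the estimate positive
        gain = xi / (1.0 + xi)             # Wiener gain as a stand-in for MMSE-STSA
        prev_clean_power = (gain ** 2) * y2
        xi_track.append(xi)
    return xi_track


if __name__ == "__main__":
    # Two noise-only frames followed by two frames with a stronger component.
    print(decision_directed_snr([1.0, 1.0, 9.0, 9.0], noise_power=1.0))
```

A value of `alpha` close to 1 smooths the estimate heavily, which tends to suppress musical noise at the cost of slower tracking of speech onsets. A smaller `alpha` follows the instantaneous estimate more closely. This is the tuning trade-off between noise reduction and signal distortion that the abstract refers to.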