Speech enhancement based on joint time-frequency segmentation

  • Authors:
  • C. Tantibundhit; F. Pernkopf; G. Kubin

  • Affiliations:
  • MedIntelligence and Innovation Laboratory, Thammasat University, Thailand; Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria; Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria

  • Venue:
  • ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
  • Year:
  • 2009

Abstract

We present an algorithm to decompose speech into transient and non-transient components. Our algorithm, the joint time-frequency segmentation algorithm, takes the wavelet packet coefficients of the speech signal and represents them as tiles of a time-frequency representation adapted to the characteristics of the signal itself. Any wavelet packet coefficient whose tiling height is greater than or equal to its tiling width is classified as a transient coefficient; any other coefficient is classified as non-transient. The transient component is selectively amplified and recombined with the original speech to generate the modified speech, whose energy is adjusted to equal that of the original speech. Psychoacoustic tests with fourteen human listeners show that the speech modification significantly improves speech intelligibility in background noise, by 10% absolute at 0 dB up to 31% absolute at −30 dB.
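
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions the abstract does not confirm: a Haar filter pair stands in for the unspecified wavelet packet filters, the adapted tiling is chosen by a best-basis search with an l1 cost, a tile at depth j is taken to be 2^j samples wide and n / 2^j high (so every tile has area n), and the `gain` parameter and all function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def split(c):
    """One orthonormal Haar analysis step: (low-pass half, high-pass half)."""
    low = [(c[i] + c[i + 1]) / SQRT2 for i in range(0, len(c), 2)]
    high = [(c[i] - c[i + 1]) / SQRT2 for i in range(0, len(c), 2)]
    return low, high

def merge(low, high):
    """Inverse of split(): rebuild the parent coefficients."""
    out = []
    for a, d in zip(low, high):
        out += [(a + d) / SQRT2, (a - d) / SQRT2]
    return out

def _best(c, depth, n):
    """Return (cost, transient part of c) for the cheapest basis under this node."""
    leaf_cost = sum(abs(v) for v in c)   # l1 cost (assumed, not from the abstract)
    width = 2 ** depth                   # time extent of one tile, in samples
    height = n / 2 ** depth              # frequency extent, scaled so tile area is n (assumed)
    # Rule from the abstract: height >= width means transient, so keep the coefficients.
    leaf = list(c) if height >= width else [0.0] * len(c)
    if len(c) < 2:
        return leaf_cost, leaf
    low, high = split(c)
    cost_l, t_l = _best(low, depth + 1, n)
    cost_h, t_h = _best(high, depth + 1, n)
    if cost_l + cost_h < leaf_cost:      # children give a sparser representation
        return cost_l + cost_h, merge(t_l, t_h)
    return leaf_cost, leaf

def transient_component(x):
    """Transient part of x; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    return _best(list(x), 0, n)[1]

def enhance(x, gain=2.0):
    """Add the amplified transient part to x, then rescale to the energy of x."""
    t = transient_component(x)
    y = [xi + gain * ti for xi, ti in zip(x, t)]
    e_x = sum(v * v for v in x)
    e_y = sum(v * v for v in y)
    scale = math.sqrt(e_x / e_y) if e_y > 0 else 0.0
    return [scale * v for v in y]
```

Under these assumptions, calling `enhance(signal, gain=2.0)` on a power-of-two-length list returns a signal of the same length and energy in which coefficients from tall, narrow tiles carry more weight. A single impulse is classified entirely as transient, while a constant signal yields an all-zero transient component.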