Speech intelligibility improvement using convolutive blind source separation assisted by denoising algorithms

  • Authors:
  • Jedrzej Kocinski

  • Affiliations:
  • Institute of Acoustics, Faculty of Physics, Adam Mickiewicz University, 85 Umultowska str., 61-614 Poznan, Poland

  • Venue:
  • Speech Communication
  • Year:
  • 2008

Abstract

The present study is concerned with the blind source separation (BSS) of speech and speech-shaped noise sources. All recordings were carried out in an anechoic chamber using a dummy head (two microphones, one in each ear). A program implementing the algorithm for BSS of convolutive mixtures introduced by Parra and Spence [Parra, L., Spence, C., 2000a. Convolutive blind source separation of non-stationary sources. IEEE Trans. Speech Audio Process. 8(3), 320-327 (US Patent US6167417)] was used to separate the signals. In the postprocessing phase, two different denoising algorithms were used. The first was based on a minimum mean-square error log-spectral amplitude estimator [Ephraim, Y., Malah, D., 1985. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator. IEEE Trans. Acoust. Speech Signal Process. ASSP-33(2), 443-445], while the second was based on a Wiener filter using the a priori signal-to-noise ratio estimation introduced by Ephraim and Malah (cited above) [Scalart, P., Filho, J.V., 1996. Speech enhancement based on a priori signal to noise estimation. IEEE Internat. Conf. Acoust. Speech Signal Process. 1, 629-632]. Nonsense word tests served as the target speech in both cases, with one or two disturbing sources as interference. Speech intelligibility before and after the BSS was measured for three subjects with audiologically normal hearing. The speech signal after BSS was then denoised and presented to the same listeners. The results revealed some ambiguities caused by the insufficient number of microphones relative to the number of sound sources. With only one disturbance, the intelligibility improvement was significant. However, with two disturbances and the target speech, the separation was much poorer. The additional denoising, as could be expected, raised the intelligibility slightly.
Although the BSS method requires more research on optimization, the results of the investigation imply that it may be applied to hearing aids in the future.
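
As a rough illustration of the second denoising approach named in the abstract (a Wiener filter driven by an a priori signal-to-noise ratio estimate), the sketch below computes per-frame gains for a single frequency bin using the decision-directed recursion commonly associated with Ephraim and Malah. It is not the author's implementation: the function name `wiener_gains`, the smoothing factor `alpha`, the SNR floor `xi_min`, and the assumption of a constant, known noise power are all illustrative choices.

```python
def wiener_gains(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Per-frame Wiener gains for one frequency bin (illustrative sketch).

    noisy_power: sequence of |Y|^2 values, one per frame
    noise_power: noise power estimate for this bin (assumed constant, > 0)
    alpha:       decision-directed smoothing factor (assumed value)
    xi_min:      lower bound on the a priori SNR (assumed value)
    """
    gains = []
    prev_clean_power = 0.0  # |G * Y|^2 from the previous frame
    for y2 in noisy_power:
        gamma = y2 / noise_power            # a posteriori SNR
        ml_term = max(gamma - 1.0, 0.0)     # instantaneous SNR estimate
        # Decision-directed a priori SNR: blend of the previous clean
        # estimate and the current instantaneous estimate.
        xi = alpha * prev_clean_power / noise_power + (1.0 - alpha) * ml_term
        xi = max(xi, xi_min)
        g = xi / (1.0 + xi)                 # Wiener gain
        gains.append(g)
        prev_clean_power = g * g * y2
    return gains


if __name__ == "__main__":
    # A bin containing only noise is strongly attenuated; a bin with a
    # strong steady component is passed almost unchanged after a few frames.
    print(wiener_gains([1.0] * 5, 1.0))
    print(wiener_gains([100.0] * 5, 1.0))
```

In a complete denoiser these gains would be computed for every bin of a short-time Fourier transform and multiplied onto the noisy spectrum before resynthesis; the noise power would normally be estimated from speech-free frames rather than assumed known.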