Speech enhancement using a minimum mean-square error short-time spectral modulation magnitude estimator

  • Authors:
  • Kuldip Paliwal;Belinda Schwerin;Kamil Wójcicki

  • Affiliations:
  • Signal Processing Laboratory, Griffith School of Engineering, Griffith University, Nathan, QLD 4111, Australia (all authors)

  • Venue:
  • Speech Communication
  • Year:
  • 2012

Abstract

In this paper we investigate the enhancement of speech by applying minimum mean-square error (MMSE) short-time spectral magnitude estimation in the modulation domain. For this purpose, the traditional analysis-modification-synthesis framework is extended to include modulation domain processing. We compensate the noisy modulation spectrum for additive noise distortion by applying the MMSE short-time spectral magnitude estimation algorithm in the modulation domain. A number of subjective experiments were conducted. First, we determine the parameter values that maximise the subjective quality of stimuli enhanced using the MMSE modulation magnitude estimator. Next, we compare the quality of stimuli processed by the MMSE modulation magnitude estimator with that of stimuli processed using the MMSE acoustic magnitude estimator and the modulation spectral subtraction method, and show that the proposed approach achieves a good improvement in speech quality. We then evaluate the effect of including speech presence uncertainty and log-domain processing on the quality of enhanced speech, and find that the method performs better when speech presence uncertainty is included. Finally, we compare the quality of speech enhanced using the MMSE modulation magnitude estimator (with speech presence uncertainty) with that of speech enhanced using different acoustic domain MMSE magnitude estimator formulations and different modulation domain based enhancement algorithms. Results of these tests show that the MMSE modulation magnitude estimator improves the quality of processed stimuli without introducing musical noise or spectral smearing distortion. The proposed method is shown to have better noise suppression than MMSE acoustic magnitude estimation, and improved speech quality compared to the other modulation domain based enhancement methods considered.
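
As a rough illustration of the estimator the abstract refers to, the sketch below computes an MMSE short-time spectral magnitude gain for a single spectral component. It assumes the classic Ephraim-Malah form of the gain, which the abstract itself does not spell out. The function names (`bessel_i`, `mmse_magnitude_gain`, `enhance_modulation_magnitude`) and the symbols `xi` (a priori SNR) and `gamma` (a posteriori SNR) are hypothetical choices for this example, not taken from the paper. In a modulation domain setting the same gain would be applied to each modulation magnitude component, with the SNRs estimated in that domain.

```python
import math


def bessel_i(order, x, max_terms=500):
    """Modified Bessel function of the first kind, I_order(x), via its power series."""
    half = x / 2.0
    term = half ** order / math.factorial(order)  # k = 0 term of the series
    total = term
    for k in range(1, max_terms):
        # Ratio of consecutive series terms: (x/2)^2 / (k * (k + order))
        term *= (half * half) / (k * (k + order))
        total += term
        if term <= 1e-17 * total:
            break
    return total


def mmse_magnitude_gain(xi, gamma):
    """MMSE short-time spectral magnitude gain (assumed Ephraim-Malah form).

    xi:    a priori SNR (linear scale, >= 0)
    gamma: a posteriori SNR (linear scale, > 0)
    """
    v = xi / (1.0 + xi) * gamma
    bracket = (1.0 + v) * bessel_i(0, v / 2.0) + v * bessel_i(1, v / 2.0)
    return (math.sqrt(math.pi) / 2.0) * (math.sqrt(v) / gamma) * math.exp(-v / 2.0) * bracket


def enhance_modulation_magnitude(noisy_magnitude, xi, gamma):
    """Scale one noisy (modulation) magnitude value by the MMSE gain."""
    return mmse_magnitude_gain(xi, gamma) * noisy_magnitude


if __name__ == "__main__":
    # Print the gain for a few illustrative (xi, gamma) pairs.
    for xi, gamma in [(0.1, 1.0), (1.0, 1.0), (10.0, 5.0)]:
        print(xi, gamma, mmse_magnitude_gain(xi, gamma))
```

At high SNR this gain approaches the Wiener gain `xi / (1 + xi)`, which gives a quick sanity check. The plain power series is used only to keep the sketch dependency-free; it can overflow for very large `v`, so a practical implementation would more likely rely on exponentially scaled Bessel routines such as `scipy.special.i0e` and `scipy.special.i1e`.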