Real and imaginary modulation spectral subtraction for speech enhancement

Authors:
Yi Zhang;Yunxin Zhao
Affiliations:
Department of Computer Science, University of Missouri-Columbia, Columbia, MO 65211, USA;Department of Computer Science, University of Missouri-Columbia, Columbia, MO 65211, USA
Venue:
Speech Communication
Year:
2013

Abstract

In this paper, we propose a novel spectral subtraction method for enhancing noisy speech. Instead of the conventional approach of subtracting on the magnitude spectrum in the acoustic frequency domain, the proposed method, referred to as MRISS, performs subtraction on the real and imaginary spectra separately in the modulation frequency domain. This allows spectral subtraction to enhance phase as well as magnitude. We conducted objective and subjective evaluations comparing MRISS with three existing methods: modulation frequency domain magnitude spectral subtraction (MSS), nonlinear spectral subtraction (NSS), and minimum mean square error estimation (MMSE). The objective evaluation used segmental signal-to-noise ratio (segmental SNR), perceptual evaluation of speech quality (PESQ), and average Itakura-Saito spectral distance (ISD). The subjective evaluation used a mean preference score from 14 participants. In both evaluations, MRISS outperformed the three existing methods. Further analysis shows that this gain comes from improved recovery of both the acoustic magnitude spectrum and the acoustic phase spectrum.
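
The central idea of the abstract, subtracting on the real and imaginary parts separately in the modulation domain, can be illustrated with a minimal sketch. It is not the authors' exact algorithm: the abstract gives no implementation details, so the function names, the frame length, hop size, and spectral floor, and the noise estimate taken from a noise-only segment are all assumptions made for illustration. The input is assumed to be an already computed complex acoustic STFT of shape (frequency bins, frames). For each acoustic frequency bin, the time trajectories of the real and imaginary parts are each transformed by a second short-time Fourier analysis, a noise magnitude estimate is subtracted with flooring, and the results are recombined into a complex spectrum.

```python
import numpy as np


def _frames(x, frame_len, hop):
    """Zero-pad a 1-D real signal at both ends and cut it into overlapping frames."""
    pad = frame_len - hop  # gives every original sample full frame coverage
    total = len(x) + 2 * pad
    n_frames = 1 + int(np.ceil(max(0, total - frame_len) / hop))
    padded = np.zeros((n_frames - 1) * hop + frame_len)
    padded[pad:pad + len(x)] = x
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return padded[idx], pad


def noise_modulation_magnitude(noise_traj, frame_len=32, hop=8):
    """Average modulation magnitude spectrum of a noise-only trajectory (assumed estimator)."""
    frames, _ = _frames(noise_traj, frame_len, hop)
    spec = np.fft.rfft(frames * np.hanning(frame_len), axis=1)
    return np.abs(spec).mean(axis=0)


def subtract_trajectory(traj, noise_mag, frame_len=32, hop=8, floor=0.01):
    """Magnitude subtraction in the modulation domain for one real-valued trajectory."""
    window = np.hanning(frame_len)
    frames, pad = _frames(traj, frame_len, hop)
    spec = np.fft.rfft(frames * window, axis=1)
    mag = np.abs(spec)
    # Subtract the noise estimate, but never go below a small fraction of the input.
    new_mag = np.maximum(mag - noise_mag, floor * mag)
    new_frames = np.fft.irfft(new_mag * np.exp(1j * np.angle(spec)),
                              n=frame_len, axis=1)
    # Weighted overlap-add back to a single trajectory.
    out = np.zeros((frames.shape[0] - 1) * hop + frame_len)
    norm = np.zeros_like(out)
    for i, frame in enumerate(new_frames):
        sl = slice(i * hop, i * hop + frame_len)
        out[sl] += frame * window
        norm[sl] += window ** 2
    out /= np.maximum(norm, 1e-12)
    return out[pad:pad + len(traj)]


def real_imag_modulation_subtraction(noisy_stft, noise_stft,
                                     frame_len=32, hop=8, floor=0.01):
    """Process real and imaginary trajectories of each acoustic bin separately.

    noisy_stft: complex array, shape (n_bins, n_frames), acoustic STFT of noisy speech.
    noise_stft: complex array, shape (n_bins, n_noise_frames), noise-only segment.
    """
    enhanced = np.zeros_like(noisy_stft, dtype=complex)
    for k in range(noisy_stft.shape[0]):
        parts = []
        for part in (np.real, np.imag):
            noise_mag = noise_modulation_magnitude(part(noise_stft[k]), frame_len, hop)
            parts.append(subtract_trajectory(part(noisy_stft[k]), noise_mag,
                                             frame_len, hop, floor))
        enhanced[k] = parts[0] + 1j * parts[1]
    return enhanced
```

Because the real and imaginary trajectories are modified independently, the recombined complex value generally has a different phase from the noisy input. This is how a scheme of this kind can alter the acoustic phase as well as the magnitude, whereas magnitude-only subtraction reuses the noisy phase unchanged.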