Speech enhancement using a minimum mean-square error short-time spectral modulation magnitude estimator

Authors:
Kuldip Paliwal;Belinda Schwerin;Kamil Wójcicki
Affiliations:
Signal Processing Laboratory, Griffith School of Engineering, Griffith University, Nathan, QLD 4111, Australia;Signal Processing Laboratory, Griffith School of Engineering, Griffith University, Nathan, QLD 4111, Australia;Signal Processing Laboratory, Griffith School of Engineering, Griffith University, Nathan, QLD 4111, Australia
Venue:
Speech Communication
Year:
2012

Abstract

In this paper we investigate the enhancement of speech by applying minimum mean-square error (MMSE) short-time spectral magnitude estimation in the modulation domain. For this purpose, the traditional analysis-modification-synthesis framework is extended to include modulation domain processing. We compensate the noisy modulation spectrum for additive noise distortion by applying the MMSE short-time spectral magnitude estimation algorithm in the modulation domain. A number of subjective experiments were conducted. First, we determined the parameter values that maximise the subjective quality of stimuli enhanced using the MMSE modulation magnitude estimator. Next, we compared the quality of stimuli processed by the MMSE modulation magnitude estimator with that of stimuli processed using the MMSE acoustic magnitude estimator and the modulation spectral subtraction method, and show that the proposed approach yields an improvement in speech quality. We then evaluated the effect of including speech presence uncertainty and log-domain processing on the quality of enhanced speech, and found that the method works better with speech presence uncertainty. Finally, we compared the quality of speech enhanced using the MMSE modulation magnitude estimator (used with speech presence uncertainty) with that of speech enhanced using different acoustic domain MMSE magnitude estimator formulations and different modulation domain based enhancement algorithms. The results of these tests show that the MMSE modulation magnitude estimator improves the quality of processed stimuli without introducing musical noise or spectral smearing distortion. The proposed method is shown to provide better noise suppression than MMSE acoustic magnitude estimation, and improved speech quality compared with the other modulation domain based enhancement methods considered.
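
To make the core operation concrete, the following is a minimal sketch of the classical MMSE short-time spectral amplitude gain with a decision-directed a priori SNR estimate, applied to a single bin's magnitude trajectory. The abstract gives no formulas, so this follows the standard textbook form of the estimator and is not the authors' implementation; it omits speech presence uncertainty and log-domain processing. The function names, the smoothing constant `alpha = 0.98`, the SNR floor `xi_min`, and the assumption of a known, constant noise power are all illustrative choices. In the modulation domain, a recursion of this kind would be run over the modulation magnitude spectrum rather than the acoustic one.

```python
import math


def _scaled_bessel_i(order, x, steps=2048):
    """Return exp(-x) * I_order(x) for x >= 0.

    Uses I_n(x) = (1/pi) * integral_0^pi exp(x*cos(t)) * cos(n*t) dt,
    evaluated with the trapezoidal rule. Scaling by exp(-x) avoids overflow.
    """
    h = math.pi / steps
    total = 0.0
    for j in range(steps + 1):
        t = j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.exp(x * (math.cos(t) - 1.0)) * math.cos(order * t)
    return total * h / math.pi


def mmse_stsa_gain(xi, gamma):
    """Classical MMSE spectral amplitude gain.

    xi    : a priori SNR (linear scale, > 0)
    gamma : a posteriori SNR (linear scale, > 0)
    """
    nu = xi / (1.0 + xi) * gamma
    i0e = _scaled_bessel_i(0, nu / 2.0)  # exp(-nu/2) * I0(nu/2)
    i1e = _scaled_bessel_i(1, nu / 2.0)  # exp(-nu/2) * I1(nu/2)
    return (math.sqrt(math.pi) / 2.0) * (math.sqrt(nu) / gamma) * (
        (1.0 + nu) * i0e + nu * i1e
    )


def enhance_trajectory(noisy_mags, noise_power, alpha=0.98, xi_min=10 ** (-25 / 10)):
    """Apply the gain frame by frame to one bin's magnitude trajectory.

    noisy_mags  : noisy magnitudes for successive frames of one bin
    noise_power : noise power for this bin (assumed known and constant here)
    alpha       : decision-directed smoothing constant (illustrative value)
    xi_min      : lower limit on the a priori SNR (illustrative value)
    """
    enhanced = []
    prev_clean_power = 0.0  # no clean-speech estimate before the first frame
    for mag in noisy_mags:
        gamma = max(mag * mag / noise_power, 1e-12)
        # Decision-directed a priori SNR: previous estimate blended with the
        # half-wave rectified instantaneous SNR (gamma - 1).
        xi = alpha * prev_clean_power / noise_power
        xi += (1.0 - alpha) * max(gamma - 1.0, 0.0)
        xi = max(xi, xi_min)
        estimate = mmse_stsa_gain(xi, gamma) * mag
        enhanced.append(estimate)
        prev_clean_power = estimate * estimate
    return enhanced
```

For example, `enhance_trajectory([10.0] * 5, noise_power=1.0)` leaves a strong component almost unchanged after a few frames, whereas a trajectory whose power equals the noise power is heavily attenuated. With SciPy available, the helper `_scaled_bessel_i` could be replaced by `scipy.special.i0e` and `scipy.special.i1e`.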