Distributed multichannel speech enhancement with minimum mean-square error short-time spectral amplitude, log-spectral amplitude, and spectral phase estimation

Authors:
Marek B. Trawicki;Michael T. Johnson
Affiliations:
Marquette University, Department of Electrical and Computer Engineering, Speech and Signal Processing Laboratory, P.O. Box 1881, Milwaukee, WI 53201-1881, USA;Marquette University, Department of Electrical and Computer Engineering, Speech and Signal Processing Laboratory, P.O. Box 1881, Milwaukee, WI 53201-1881, USA
Venue:
Signal Processing
Year:
2012

Citing 5
Cited 1

Practical approaches to speech coding

Assessment for automatic speech recognition II: NOISEX-92: a database and an experiment to study the effect of additive noise on speech recognition systems

Speech Communication - Special issue on speech processing in adverse conditions
Multichannel direction-independent speech enhancement using spectral amplitude estimation

EURASIP Journal on Applied Signal Processing
Optimal distributed microphone phase estimation

ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
Evaluation of Objective Quality Measures for Speech Enhancement

IEEE Transactions on Audio, Speech, and Language Processing

Fast communication: Improved modulation spectrum enhancement methods for robust speech recognition

Signal Processing

Abstract

In this paper, the authors present optimal multichannel frequency-domain estimators for minimum mean-square error (MMSE) short-time spectral amplitude (STSA), log-spectral amplitude (LSA), and spectral phase estimation in a widely distributed microphone configuration. The estimators use Rayleigh and Gaussian statistical models for the speech prior and noise likelihood, respectively, and assume a diffuse noise field for the surrounding environment. Evaluated with Signal-to-Noise Ratio (SNR), Segmental Signal-to-Noise Ratio (SSNR), Log-Likelihood Ratio (LLR), and Perceptual Evaluation of Speech Quality (PESQ) as objective metrics, the multichannel LSA estimator reduces background noise and speech distortion and improves speech quality relative to the baseline single-channel STSA and LSA estimators. The optimal multichannel spectral phase estimator contributes significantly to these improvements, and the approach demonstrates robustness due to time alignment and attenuation factor estimation. Overall, the optimal distributed microphone spectral estimators show strong results in noisy environments, with applications in many consumer, industrial, and military products.
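The abstract names SNR and SSNR as two of its objective metrics without defining them. As a rough illustration only, the sketch below computes both from a clean reference and an enhanced signal using commonly seen time-domain definitions. The function names, the 256-sample frame length, and the [-10, 35] dB per-frame clamp are assumptions chosen for this example. They are not taken from the paper, whose exact definitions may differ.

```python
import numpy as np

EPS = np.finfo(float).eps  # guards against log(0) and division by zero


def snr_db(clean, enhanced):
    """Global SNR in dB: clean-signal energy over residual-error energy."""
    clean = np.asarray(clean, dtype=float)
    error = clean - np.asarray(enhanced, dtype=float)
    return 10.0 * np.log10((np.sum(clean ** 2) + EPS) / (np.sum(error ** 2) + EPS))


def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs, each clamped to [lo, hi] dB.

    frame_len, lo, and hi are illustrative defaults (assumptions),
    not values reported in the paper.
    """
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    n_frames = len(clean) // frame_len
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    # Drop the trailing partial frame and split into non-overlapping frames.
    c = clean[: n_frames * frame_len].reshape(n_frames, frame_len)
    e = enhanced[: n_frames * frame_len].reshape(n_frames, frame_len)
    num = np.sum(c ** 2, axis=1) + EPS
    den = np.sum((c - e) ** 2, axis=1) + EPS
    frame_snr = 10.0 * np.log10(num / den)
    return float(np.mean(np.clip(frame_snr, lo, hi)))
```

Clamping each frame before averaging keeps silent or near-perfect frames from dominating the mean, which is the usual motivation for reporting SSNR alongside global SNR.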