A data-driven approach to optimizing spectral speech enhancement methods for various error criteria

Authors:
Jan Erkelens;Jesper Jensen;Richard Heusdens
Affiliations:
Department of Mediamatics, Information and Communication Theory Group, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands;Department of Mediamatics, Information and Communication Theory Group, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands;Department of Mediamatics, Information and Communication Theory Group, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands
Venue:
Speech Communication
Year:
2007

Abstract

Gain functions for spectral noise suppression have been derived in the literature for some error criteria and statistical models. These gain functions are optimal only when the statistical model is correct and the speech and noise spectral variances are known. Unfortunately, the speech distributions are unknown and can at best be determined conditionally on the estimated spectral variance. We show that the "decision-directed" approach for speech spectral variance estimation can have a significant bias at low SNRs, which generally leads to too much speech suppression. To correct for such estimation inaccuracies and adapt to the unknown speech statistics, we propose a general optimization procedure in which two gain functions are applied in parallel. A conventional algorithm is run in the background and is used for a priori SNR estimation only. For the final reconstruction, a different gain function is used, optimized for a wide range of signal-to-noise ratios. The gain function used for the reconstruction is trained on a speech database by minimizing a relevant error criterion. The procedure is illustrated for several error criteria. The method compares favorably to current state-of-the-art methods and needs less smoothing in the decision-directed spectral variance estimator.
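
The two-gain idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It makes several assumptions that the abstract does not state: the background algorithm is a Wiener gain, the trained gain is a lookup table indexed by the a priori SNR alone, and the error criterion is the squared error of the spectral amplitude. The function names, the smoothing factor `alpha`, the floor `xi_min`, and the bin edges are all hypothetical.

```python
import numpy as np

def decision_directed_xi(noisy_power, noise_var, alpha=0.98, xi_min=1e-3):
    """A priori SNR per frame for one frequency bin.

    The background gain (assumed here to be a Wiener gain) is used only
    to feed the decision-directed recursion, not for reconstruction.
    """
    xi = np.empty(len(noisy_power))
    prev_clean_power = 0.0  # background estimate |A_hat|^2 of the previous frame
    for t, r2 in enumerate(noisy_power):
        gamma = r2 / noise_var  # a posteriori SNR
        xi[t] = max(alpha * prev_clean_power / noise_var
                    + (1.0 - alpha) * max(gamma - 1.0, 0.0), xi_min)
        g_bg = xi[t] / (1.0 + xi[t])  # background Wiener gain
        prev_clean_power = (g_bg ** 2) * r2
    return xi

def train_gain_table(xi, noisy_amp, clean_amp, edges_db):
    """Reconstruction gain per a priori SNR bin, trained on paired data.

    Within each bin the gain G minimizes sum (A - G * R)^2, whose
    closed-form solution is sum(A * R) / sum(R^2).
    """
    idx = np.digitize(10.0 * np.log10(xi), edges_db)
    table = np.ones(len(edges_db) + 1)  # unpopulated bins keep unit gain
    for k in range(len(table)):
        m = idx == k
        if m.any():
            table[k] = np.sum(clean_amp[m] * noisy_amp[m]) / np.sum(noisy_amp[m] ** 2)
    return table
```

At enhancement time, under the same assumptions, each frame's a priori SNR from `decision_directed_xi` would select an entry of the trained table, and that entry would scale the noisy amplitude. A different error criterion would replace only the closed-form expression inside `train_gain_table`.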