Environment-Optimized Speech Enhancement
IEEE Transactions on Audio, Speech, and Language Processing
While most speech enhancement algorithms improve speech quality, they may not improve speech intelligibility in noise. This paper develops an algorithm that can be optimized for a specific acoustic environment and that improves speech intelligibility. The proposed method decomposes the input signal into time-frequency (T-F) units and uses a Bayesian classifier to decide whether each T-F unit is dominated by the target signal or by the noise masker. Target-dominated T-F units are retained, while masker-dominated T-F units are discarded. The Bayesian classifier is trained separately for each acoustic environment using an incremental approach that continuously updates the model parameters as more data become available. Listening experiments were conducted to assess the intelligibility of speech synthesized using the incrementally adapted models as a function of the number of training sentences. Results indicated substantial improvements in intelligibility (over 60% in babble at 5 dB SNR), obtained with as few as ten training sentences in babble and with at least 80 training sentences in the other noisy conditions.
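As a minimal sketch of the T-F decomposition and binary-masking step described above, the following uses an ideal binary mask: oracle local-SNR decisions stand in for the trained Bayesian classifier, since the paper's classifier and features are not specified here. All function names, the frame/hop sizes, and the local criterion `lc_db` are illustrative assumptions.

```python
import numpy as np

FRAME, HOP = 256, 128  # analysis frame length and hop (illustrative values)

def stft(x):
    # Hann-windowed framed FFT: rows are time frames, columns are frequency bins
    win = np.hanning(FRAME)
    n = 1 + (len(x) - FRAME) // HOP
    return np.array([np.fft.rfft(win * x[i * HOP:i * HOP + FRAME])
                     for i in range(n)])

def istft(X):
    # Overlap-add resynthesis (analysis window only; ~COLA at 50% overlap)
    out = np.zeros(HOP * (len(X) - 1) + FRAME)
    for i, spec in enumerate(X):
        out[i * HOP:i * HOP + FRAME] += np.fft.irfft(spec, FRAME)
    return out

def ideal_binary_mask(target, masker, lc_db=0.0):
    # Oracle stand-in for the Bayesian classifier: retain a T-F unit
    # when its local SNR exceeds the local criterion lc_db
    snr = 20 * np.log10(np.abs(stft(target)) /
                        (np.abs(stft(masker)) + 1e-12) + 1e-12)
    return (snr > lc_db).astype(float)

# Synthetic example: a tonal target mixed with white noise
fs = 8000
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 440 * t)
masker = 0.5 * np.random.default_rng(0).standard_normal(fs)
mix = target + masker

mask = ideal_binary_mask(target, masker)  # 1 = target-dominated, 0 = masker-dominated
enhanced = istft(mask * stft(mix))        # discard masker-dominated units, resynthesize
```

In the paper's system the 0/1 decisions come from an incrementally trained, environment-specific Bayesian classifier rather than from oracle access to the separated target and masker; the masking and resynthesis stages, however, follow the same pattern.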