A Novel Expectation-Maximization Framework for Speech Enhancement in Non-Stationary Noise Environments

Authors:
Daniel P. K. Lun; Tak-Wai Shen; K. C. Ho
Affiliations:
Dept. of Electron. & Inf. Eng., Hong Kong Polytech. Univ., Hong Kong, China; Dept. of Electron. & Inf. Eng., Hong Kong Polytech. Univ., Hong Kong, China; Dept. of Electr. & Comput. Eng., Univ. of Missouri, Columbia, MO, USA
Venue:
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Year:
2014

Abstract

Voiced speech has a quasi-periodic nature that allows it to be compactly represented in the cepstral domain, a feature that distinguishes it from noise. Recently, the temporal cepstrum smoothing (TCS) algorithm was proposed and shown to be effective for speech enhancement in non-stationary noise environments. However, its lack of an automatic parameter-updating mechanism limits its adaptability to noisy speech whose SNR changes abruptly across time frames or frequency components. In this paper, an improved speech enhancement algorithm based on a novel expectation-maximization (EM) framework is proposed. The new algorithm starts with the traditional TCS method, which gives an initial guess of the periodogram of the clean speech. This guess is then passed to an L1-norm regularizer in the M-step of the EM framework to estimate the true power spectrum of the original speech. That estimate in turn yields the a priori SNR, which is used in the E-step, in effect a log-MMSE gain function, to refine the estimate of the clean speech periodogram. The M-step and E-step alternate until convergence. A notable improvement of the proposed algorithm over the traditional TCS method is its adaptability to changes (even abrupt changes) in the SNR of the noisy speech. The performance of the proposed algorithm is evaluated using standard measures on a large set of speech and noise signals. The evaluation results show a significant improvement over conventional approaches, especially in non-stationary noise environments, where most conventional algorithms fail to perform well.
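The alternation described in the abstract can be illustrated with a minimal per-frame sketch in plain Python. It is not the authors' implementation. The E-step uses the standard log-MMSE (log-spectral amplitude) gain, G = xi/(1+xi) * exp(0.5 * E1(v)) with v = xi * gamma / (1+xi). The abstract does not say in which domain the L1 penalty acts or how it is minimised, so the M-step here is only a placeholder: it soft-thresholds each bin, which is the closed-form minimiser of a scalar L1-penalised least-squares problem. The function names, the threshold weight `lam`, the SNR floor `xi_min`, and the stopping rule are all assumptions, and the initial periodogram is taken as a given input rather than computed by TCS.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp1(x):
    """Exponential integral E1(x) for x > 0."""
    if x <= 0.0:
        raise ValueError("exp1 requires x > 0")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_k (-x)^k / (k * k!)
        total, term = 0.0, 1.0
        for k in range(1, 60):
            term *= -x / k           # term = (-x)^k / k!
            total -= term / k
        return -EULER_GAMMA - math.log(x) + total
    # Continued fraction evaluated with the modified Lentz method
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, 200):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-12:
            break
    return h * math.exp(-x)

def log_mmse_gain(xi, gamma):
    """Log-MMSE gain for a priori SNR xi and a posteriori SNR gamma."""
    v = max(xi / (1.0 + xi) * gamma, 1e-10)  # guard: E1 is undefined at 0
    return xi / (1.0 + xi) * math.exp(0.5 * exp1(v))

def soft_threshold(value, lam):
    """Minimiser of 0.5 * (value - x)**2 + lam * abs(x) over x."""
    return math.copysign(max(abs(value) - lam, 0.0), value)

def em_enhance_frame(noisy_pgram, noise_psd, init_clean_pgram,
                     lam=0.1, xi_min=1e-3, max_iter=20, tol=1e-6):
    """Alternate a placeholder M-step and a log-MMSE E-step on one frame.

    All arguments are per-bin lists; lam, xi_min, max_iter and tol are
    illustrative assumptions, not values taken from the paper.
    """
    clean = list(init_clean_pgram)
    for _ in range(max_iter):
        # M-step (placeholder): L1-style shrinkage of the periodogram estimate
        speech_psd = [soft_threshold(p, lam * n)
                      for p, n in zip(clean, noise_psd)]
        # E-step: a priori SNR -> log-MMSE gain -> refined clean periodogram
        updated = []
        for y, n, s in zip(noisy_pgram, noise_psd, speech_psd):
            xi = max(s / n, xi_min)
            gain = log_mmse_gain(xi, y / n)
            updated.append(gain * gain * y)
        change = max(abs(u - c) for u, c in zip(updated, clean))
        clean = updated
        if change < tol:
            break
    return clean

if __name__ == "__main__":
    # Two bins: one dominated by speech, one dominated by noise
    print(em_enhance_frame([100.0, 1.2], [1.0, 1.0], [99.0, 0.2]))
```

In this toy call the high-SNR bin is passed almost unchanged while the noise-dominated bin is strongly attenuated, which is the qualitative behaviour a gain driven by the a priori SNR should show. The iteration cap is there because nothing in this simplified sketch guarantees convergence.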