HMM-Based Gain Modeling for Enhancement of Speech in Noise

Authors:
David Y. Zhao; W. Bastiaan Kleijn
Affiliations:
School of Electrical Engineering, Royal Institute of Technology, Stockholm
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2007

Cited by 6

Speech enhancement with inventory style speech resynthesis
IEEE Transactions on Audio, Speech, and Language Processing

Improved noise minimum statistics estimation algorithm for using in a speech-passing noise-rejecting headset
EURASIP Journal on Advances in Signal Processing - Special issue on robust processing of nonstationary signals

Speech enhancement based on Sparse Code Shrinkage employing multiple speech models
Speech Communication

Speech enhancement using hidden Markov models in Mel-frequency domain
Speech Communication

A Novel Expectation-Maximization Framework for Speech Enhancement in Non-Stationary Noise Environments
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)

Speech enhancement using generalized weighted β-order spectral amplitude estimator
Speech Communication

Abstract

Accurate modeling and estimation of speech and noise gains facilitate good performance of speech enhancement methods using data-driven prior models. In this paper, we propose a hidden Markov model (HMM)-based speech enhancement method using explicit gain modeling. Through the introduction of stochastic gain variables, energy variation in both speech and noise is explicitly modeled in a unified framework. The speech gain models the energy variations of the speech phones, typically due to differences in pronunciation and/or different vocalizations of individual speakers. The noise gain helps to improve the tracking of the time-varying energy of nonstationary noise. The expectation-maximization (EM) algorithm is used to perform offline estimation of the time-invariant model parameters. The time-varying model parameters are estimated online using the recursive EM algorithm. The proposed gain modeling techniques are applied to a novel Bayesian speech estimator, and the performance of the proposed enhancement method is evaluated through objective and subjective tests. The experimental results confirm the advantage of explicit gain modeling, particularly for nonstationary noise sources.
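
The idea of tracking a time-varying noise gain online can be illustrated with a much simpler stand-in than the method summarized in the abstract. The sketch below is not the paper's recursive EM estimator and involves no HMM. It assumes a hypothetical toy model in which each noise frame is unit-variance white Gaussian noise scaled by the square root of a gain, and it updates the gain estimate recursively with a forgetting factor. The function name track_gain, the forgetting factor of 0.9, the frame length of 256 samples, and the synthetic gain jump from 1 to 4 are all assumptions chosen for illustration.

```python
import numpy as np


def track_gain(frames, forgetting=0.9, init_gain=1.0):
    """Recursively track a time-varying gain (variance scale) across frames.

    Toy illustration only: an exponentially weighted update of the per-frame
    energy, not the recursive EM algorithm of the paper.
    """
    gain = init_gain
    estimates = []
    for frame in frames:
        # Per-frame maximum-likelihood gain estimate under the toy model.
        frame_energy = float(np.mean(frame ** 2))
        # Blend the previous estimate with the new observation.
        gain = forgetting * gain + (1.0 - forgetting) * frame_energy
        estimates.append(gain)
    return np.array(estimates)


# Synthetic nonstationary noise: the true gain jumps from 1 to 4 at frame 100.
rng = np.random.default_rng(0)
true_gain = np.concatenate([np.full(100, 1.0), np.full(100, 4.0)])
frames = rng.standard_normal((200, 256)) * np.sqrt(true_gain)[:, None]

estimates = track_gain(frames)
```

In this toy setting the estimate settles near the true gain within a few tens of frames after the jump. A smaller forgetting factor reacts faster to energy changes but gives a noisier estimate, which is the basic trade-off any online gain tracker has to manage.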