An MMSE Estimator for Speech Enhancement Under a Combined Stochastic–Deterministic Speech Model

  • Authors:
  • R. C. Hendriks; R. Heusdens; J. Jensen

  • Affiliations:
  • Dept. of Mediamatics, Delft Univ. of Technol.; -; -

  • Venue:
  • IEEE Transactions on Audio, Speech, and Language Processing
  • Year:
  • 2007

Abstract

Although many discrete Fourier transform (DFT) domain-based speech enhancement methods derive clean speech estimators from stochastic models, such as the Gaussian and Laplace distributions, certain speech sounds clearly show a more deterministic character. In this paper, we study the use of a deterministic model in combination with the well-known stochastic models for speech enhancement. We derive a minimum mean-square error (MMSE) estimator under a combined stochastic-deterministic speech model with speech presence uncertainty and show that, for different distributions of the DFT coefficients, the combined model improves segmental signal-to-noise ratio (SNR) by approximately 0.8 dB over the use of a stochastic model alone. Evaluation with perceptual evaluation of speech quality (PESQ) shows performance improvements of approximately 0.15 on an MOS scale.
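
The segmental SNR figure quoted in the abstract is a frame-averaged quantity. As a rough illustration only, the sketch below computes it in the common textbook way: the per-frame SNR in dB is clipped to a fixed range and then averaged over frames. The function name `segmental_snr`, the frame length of 256 samples, and the clipping limits of -10 dB and 35 dB are assumptions made for this example; the abstract does not state the evaluation settings the authors used.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, min_db=-10.0, max_db=35.0):
    """Mean per-frame SNR in dB between a clean signal and an estimate.

    frame_len, min_db and max_db are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    if len(clean) != len(enhanced):
        raise ValueError("clean and enhanced must have the same length")
    n_frames = len(clean) // frame_len
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")

    eps = 1e-12  # guards against division by zero and log of zero
    total_db = 0.0
    for i in range(n_frames):
        start = i * frame_len
        stop = start + frame_len
        frame_clean = clean[start:stop]
        frame_est = enhanced[start:stop]
        # Energy of the clean frame and of the estimation error.
        signal_energy = sum(s * s for s in frame_clean)
        error_energy = sum((s - e) ** 2 for s, e in zip(frame_clean, frame_est))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        # Clip so that silent or perfectly matched frames do not dominate.
        total_db += min(max(snr_db, min_db), max_db)
    return total_db / n_frames
```

Under this definition, an improvement such as the one reported would be the difference between the values obtained for two enhanced signals measured against the same clean reference. Trailing samples that do not fill a whole frame are ignored in this sketch.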