Signal adaptive spectral envelope estimation for robust speech recognition

  • Authors:
  • Matthias Wölfel

  • Affiliations:
  • Institut für Theoretische Informatik, Universität Karlsruhe (TH), Am Fasanengarten 5, 76131 Karlsruhe, Germany

  • Venue:
  • Speech Communication
  • Year:
  • 2009

Abstract

This paper describes a novel spectral envelope estimation technique which adapts to the characteristics of the observed signal. This is made possible by introducing a second bilinear transformation into warped minimum variance distortionless response (MVDR) spectral envelope estimation. Unlike the first bilinear transformation, which is applied in the time domain, the second bilinear transformation must be applied in the frequency domain. This extension enables the resolution of the spectral envelope estimate to be steered to lower or higher frequencies, while keeping the overall resolution of the estimate and the frequency axis fixed. When embedded in the feature extraction process of an automatic speech recognition system, it emphasizes the characteristics of speech features that are relevant for robust classification, while simultaneously suppressing characteristics that are irrelevant for classification. The change in resolution may be steered, for each observation window, by the normalized first autocorrelation coefficient. To evaluate the proposed adaptive spectral envelope technique, dubbed warped-twice MVDR, we use two objective functions: class separability and word error rate. Our test set consists of development and evaluation data as provided by NIST for the Rich Transcription 2005 Spring Meeting Recognition Evaluation. For both measures, we observed consistent improvements for several speaker-to-microphone distances. On average, over all distances, the proposed front-end reduces the word error rate by 4% relative compared to the widely used mel-frequency cepstral coefficients as well as perceptual linear prediction.
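
The two quantities named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names are hypothetical, the warping formula is the standard phase response of a first-order all-pass (bilinear) transform, and the abstract does not state how the autocorrelation coefficient is mapped to a warp factor, so that step is left out.

```python
import math


def normalized_first_autocorrelation(frame):
    """Return r[1] / r[0] for one observation window.

    r[0] is the frame energy and r[1] the lag-1 autocorrelation.
    An all-zero frame returns 0.0 to avoid division by zero.
    """
    r0 = sum(x * x for x in frame)
    if r0 == 0.0:
        return 0.0
    r1 = sum(a * b for a, b in zip(frame, frame[1:]))
    return r1 / r0


def bilinear_warp(omega, alpha):
    """Map a normalized frequency omega (radians, 0..pi) through a
    first-order all-pass (bilinear) transform with warp factor alpha,
    where |alpha| < 1. Positive alpha stretches the low frequencies,
    negative alpha stretches the high frequencies, alpha = 0 is identity.
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


if __name__ == "__main__":
    # Slowly varying frame: energy concentrated at low frequencies.
    smooth = [math.cos(0.1 * n) for n in range(400)]
    # Sign-alternating frame: energy concentrated at high frequencies.
    alternating = [1.0, -1.0] * 200
    print(normalized_first_autocorrelation(smooth))
    print(normalized_first_autocorrelation(alternating))
    # alpha = 0.42 is an assumed example value, not taken from the abstract.
    print(bilinear_warp(math.pi / 4, 0.42))
```

The coefficient lies between -1 and 1, approaching 1 for frames dominated by low frequencies and -1 for frames dominated by high frequencies, which is what makes it a plausible per-window steering signal for where the envelope resolution should go.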