Warped and warped-twice MVDR spectral estimation with and without filterbanks

  • Authors:
  • Matthias Wölfel

  • Affiliations:
  • Institut für Theoretische Informatik, Universität Karlsruhe (TH), Karlsruhe, Germany

  • Venue:
  • MLMI'06 Proceedings of the Third international conference on Machine Learning for Multimodal Interaction
  • Year:
  • 2006

Abstract

This paper describes a novel extension to warped minimum variance distortionless response (MVDR) spectral estimation that makes it possible to steer the resolution of the spectral envelope estimate toward lower or higher frequencies while keeping both the overall resolution of the estimate and the frequency axis fixed. This effect is achieved by introducing a second bilinear transformation into warped MVDR spectral estimation, applied in the frequency domain rather than in the time domain, where the first bilinear transformation is applied, together with a compensation step that adjusts for the pre-emphasis introduced by both bilinear transformations. In the feature extraction stage of an automatic speech recognition system, this extension makes it possible to emphasize classification-relevant characteristics of the speech features and to drop classification-irrelevant ones, according to the characteristics of the signal being analyzed. We compared the novel extension with warped MVDR and with traditional Mel frequency cepstral coefficients (MFCC) on the development and evaluation data of the Rich Transcription 2005 Spring Meeting Recognition Evaluation lecture meeting task. The results are promising, and we are going to use the described warped and warped-twice front-end settings in the upcoming NIST evaluation.
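
The abstract does not spell out the bilinear transformation it relies on. As a rough illustration only, the sketch below computes the textbook frequency mapping of a first-order all-pass (bilinear) transform, which is the usual basis of warped spectral estimation. The function names `bilinear_warp` and `warped_axis`, the sign convention, and the warp factor 0.42 are assumptions made for this example and are not taken from the paper; the second, frequency-domain warp and the pre-emphasis compensation step are not reproduced here.

```python
import math


def bilinear_warp(omega: float, alpha: float) -> float:
    """Map a normalized frequency omega (radians, 0..pi) to its warped value.

    Uses the phase response of a first-order all-pass section:
        omega_w = omega + 2 * atan(alpha * sin(omega) / (1 - alpha * cos(omega)))
    Under this (assumed) sign convention, alpha > 0 stretches the low-frequency
    region and alpha < 0 stretches the high-frequency region.
    """
    if not -1.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between -1 and 1")
    # The denominator is always positive for |alpha| < 1, so atan2 equals atan here.
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def warped_axis(num_points: int, alpha: float) -> list[float]:
    """Warp num_points frequencies spaced uniformly over [0, pi]."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    step = math.pi / (num_points - 1)
    return [bilinear_warp(k * step, alpha) for k in range(num_points)]


if __name__ == "__main__":
    ALPHA = 0.42  # illustrative warp factor (assumption, not from the paper)
    for linear, warped in zip(warped_axis(5, 0.0), warped_axis(5, ALPHA)):
        print(f"{linear:.3f} rad -> {warped:.3f} rad")
```

The mapping leaves the end points 0 and π in place and is monotonic, so the frequency axis keeps its range while resolution is redistributed along it. Setting `alpha` to 0 gives the unwarped axis.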