Using asymmetric windows in automatic speech recognition

  • Authors:
  • Robert Rozman; Dušan M. Kodek

  • Affiliations:
  • University of Ljubljana, Faculty of Computer and Information Science, Laboratory for Architecture and Signal Processing, Tržaška 25, 1001 Ljubljana, Slovenia; University of Ljubljana, Faculty of Computer and Information Science, Laboratory for Architecture and Signal Processing, Tržaška 25, 1001 Ljubljana, Slovenia

  • Venue:
  • Speech Communication
  • Year:
  • 2007

Abstract

This paper considers the windowing problem in the short-time frequency analysis used in speech recognition systems (SRS). Since human hearing is relatively insensitive to short-time phase distortion of the speech signal, there is no apparent reason to use symmetric windows, which give a linear phase response. Furthermore, phase information is usually disregarded completely in SRS. At the same time, it is well known that relaxing the linear-phase constraint on the window yields a better magnitude response and a shorter time delay. These observations form a strong argument in favor of the research presented in this paper. First, a general overview of the role that windows play in the frequency analysis stage of SRS is presented. Properties important for speech recognition are highlighted and the potential advantages of asymmetric windows are discussed, the most important being the shorter time delay and the better magnitude response. Two possible design methods for asymmetric windows are then discussed. Since little is known about the influence of the window on SRS performance, the design methods are first considered from a frequency analysis point of view. This is followed by practical evaluations on real SRS. The results confirmed the expectations: the proposed asymmetric windows increased the robustness of elementary, isolated and connected speech recognition under a variety of adverse test conditions, particularly for a combination of additive and low-pass convolutional distortions. Further research on asymmetric windows and on the parameterization process as a whole is suggested.
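
As a rough illustration of the idea, the sketch below builds one simple asymmetric window and applies it in a magnitude-only short-time analysis step. It is not one of the two design methods the paper discusses, which the abstract does not specify. The construction (a long rising Hann half joined to a short falling Hann half), the helper names `asymmetric_hann`, `peak_to_end_delay` and `power_spectrum`, and the frame length and sampling rate are all assumptions chosen for the example.

```python
import numpy as np

def asymmetric_hann(n_rise, n_fall):
    """Illustrative asymmetric window (an assumption, not the paper's design):
    rising half of a long Hann window joined to the falling half of a short one."""
    rise = np.hanning(2 * n_rise)[:n_rise]   # slow rise from 0 to almost 1
    fall = np.hanning(2 * n_fall)[n_fall:]   # fast fall from almost 1 to 0
    return np.concatenate([rise, fall])

def peak_to_end_delay(window):
    """Samples between the window peak and the newest sample in the frame."""
    return len(window) - 1 - int(np.argmax(window))

def power_spectrum(frame, window, n_fft=256):
    """Magnitude-only short-time spectrum; the phase is discarded."""
    return np.abs(np.fft.rfft(frame * window, n_fft)) ** 2

fs = 8000                                    # assumed sampling rate in Hz
n = 200                                      # assumed 25 ms frame
sym = np.hamming(n)                          # conventional symmetric window
asym = asymmetric_hann(150, 50)              # same length, peak near the frame end

t = np.arange(n) / fs
frame = np.sin(2 * np.pi * 1000 * t)         # 1 kHz test tone
p_sym = power_spectrum(frame, sym)
p_asym = power_spectrum(frame, asym)

print("peak-to-end delay (samples):", peak_to_end_delay(sym), peak_to_end_delay(asym))
print("spectral peak bin:", int(np.argmax(p_sym)), int(np.argmax(p_asym)))
```

Because only the squared magnitude of the spectrum is kept, the nonlinear phase of the asymmetric window does not reach later processing stages in this sketch, while the window peak sits closer to the most recent samples of the frame.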