Fast and Robust Features for Prosodic Classification

  • Authors:
  • Jan Buckow;Volker Warnke;Richard Huber;Anton Batliner;Elmar Nöth;Heinrich Niemann

  • Venue:
  • TSD '99 Proceedings of the Second International Workshop on Text, Speech and Dialogue
  • Year:
  • 1999

Abstract

In our previous research, we have shown that prosody can be used to dramatically improve the performance of the automatic speech translation system Verbmobil [5,7,8]. In Verbmobil, prosodic information is made available to the different modules of the system by annotating the output of a word recognizer with prosodic markers. These markers are determined in a classification process. The computation of the prosodic features used for classification was previously based on a time alignment of the phoneme sequence of the recognized words. This phoneme segmentation was needed for the normalization of duration and energy features, but the time alignment was very expensive in terms of computational effort and memory requirements. In our new approach, the normalization is done on the word level with precomputed duration and energy statistics, so the phoneme segmentation can be avoided. With the new set of prosodic features, better classification results can be achieved, feature extraction is sped up by 64%, and the memory requirements are reduced by 92%.
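
The abstract does not state the normalization formula, so the following is only a minimal sketch of what word-level normalization with precomputed statistics could look like. It assumes a plain z-score per word; the `WordStats` layout, the function names, and the example values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class WordStats:
    """Precomputed per-word statistics (hypothetical layout, not from the paper)."""
    mean_duration: float  # seconds
    std_duration: float   # seconds
    mean_energy: float    # arbitrary energy unit
    std_energy: float


def zscore(value: float, mean: float, std: float) -> float:
    """Return (value - mean) / std, falling back to 0.0 when std is zero."""
    if std == 0.0:
        return 0.0
    return (value - mean) / std


def normalize_word(
    word: str,
    duration: float,
    energy: float,
    stats: Dict[str, WordStats],
) -> Tuple[float, float]:
    """Normalize a word's duration and energy against its precomputed statistics.

    Only word boundaries are needed here; no phoneme alignment is performed.
    The z-score is an assumption made for illustration.
    """
    s = stats[word]
    return (
        zscore(duration, s.mean_duration, s.std_duration),
        zscore(energy, s.mean_energy, s.std_energy),
    )


# Illustrative values only; real statistics would be estimated from training data.
example_stats = {"morgen": WordStats(0.40, 0.10, 60.0, 5.0)}
norm_duration, norm_energy = normalize_word("morgen", 0.55, 65.0, example_stats)
```

Because the statistics are looked up per word, the only timing information required at run time is the word boundary from the recognizer output, which is the property the abstract credits for avoiding the phoneme segmentation.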