The effect of fuzzy training targets on voice quality classification

  • Authors:
  • Stefan Scherer; John Kane; Christer Gobl; Friedhelm Schwenker

  • Affiliations:
  • Institute for Creative Technologies, University of Southern California, United States, and Institute of Neural Information Processing, Ulm University, Germany; Phonetics and Speech Laboratory, Trinity College Dublin, Ireland; Phonetics and Speech Laboratory, Trinity College Dublin, Ireland; Institute of Neural Information Processing, Ulm University, Germany

  • Venue:
  • MPRSS'12: Proceedings of the First International Conference on Multimodal Pattern Recognition of Social Signals in Human-Computer-Interaction
  • Year:
  • 2012

Abstract

The dynamic use of voice qualities in spoken language can reveal useful information about a speaker's attitude, mood, and affective state. This information may be desirable for a range of speech technology applications. However, annotations of voice quality are frequently inconsistent across raters. Which rater should one trust, or does the truth lie somewhere in between? The current study first describes a voice quality feature set suitable for differentiating voice qualities along a tense-to-breathy dimension. These features are then used as inputs to a fuzzy-input fuzzy-output support vector machine (F2SVM) algorithm to classify the voice qualities automatically. The F2SVM is compared to standard approaches and shows promising results: performance of around 90% is achieved in cross-validation, leave-one-speaker-out, and cross-corpus experiments.
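
To illustrate what a fuzzy training target could look like when raters disagree, the following is a minimal sketch, not the authors' procedure. It assumes three hypothetical class names (tense, modal, breathy) and assumes that a membership degree is simply the fraction of raters who chose each class; the abstract does not say how the study derives its fuzzy targets. A majority-vote function is included only as a crisp baseline for comparison.

```python
from collections import Counter

# Hypothetical class names along the tense-to-breathy dimension
# (an assumption for illustration; not taken from the paper).
CLASSES = ("tense", "modal", "breathy")


def fuzzy_target(rater_labels, classes=CLASSES):
    """Return per-class membership degrees.

    Assumption: membership equals the fraction of raters choosing each class.
    """
    if not rater_labels:
        raise ValueError("need at least one rater label")
    counts = Counter(rater_labels)
    unknown = set(counts) - set(classes)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    n = len(rater_labels)
    return [counts[c] / n for c in classes]


def crisp_target(rater_labels, classes=CLASSES):
    """Majority-vote baseline: a one-hot vector (ties go to the earlier class)."""
    memberships = fuzzy_target(rater_labels, classes)
    winner = memberships.index(max(memberships))
    return [1.0 if i == winner else 0.0 for i in range(len(classes))]


if __name__ == "__main__":
    votes = ["breathy", "breathy", "modal", "breathy"]
    print(fuzzy_target(votes))  # [0.0, 0.25, 0.75]
    print(crisp_target(votes))  # [0.0, 0.0, 1.0]
```

Under these assumptions, the fuzzy vector keeps the information that one of four raters heard a modal voice, whereas the crisp vector discards it.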