The GMM-SVM Supervector Approach for the Recognition of the Emotional Status from Speech

  • Authors: Friedhelm Schwenker; Stefan Scherer; Yasmine M. Magdi; Günther Palm

  • Affiliations: Institute of Neural Information Processing, University of Ulm, Ulm, Germany 89069; Institute of Neural Information Processing, University of Ulm, Ulm, Germany 89069; Computer Science and Engineering Department, German University in Cairo, Heliopolis, Egypt 11341; Institute of Neural Information Processing, University of Ulm, Ulm, Germany 89069

  • Venue: ICANN '09: Proceedings of the 19th International Conference on Artificial Neural Networks: Part I
  • Year: 2009

Abstract

Emotion recognition from speech is an important field of research in human-machine interfaces and has various applications, for instance in call centers. In the proposed classifier system, RASTA-PLP (perceptual linear prediction) features are extracted from the speech signals. The first step is to compute a universal background model (UBM) that represents the general structure of the underlying feature space of speech signals. This UBM is modeled as a Gaussian mixture model (GMM). Once the UBM has been computed, the sequence of feature vectors extracted from a single utterance is used to re-train the UBM. The mean vectors of the resulting utterance-specific GMM are extracted and concatenated into a so-called GMM supervector, which is then passed as input to a support vector machine classifier. The overall system has been evaluated on utterances from the public Berlin emotional database. With the proposed features, an utterance-based recognition rate of 79% was achieved, which is close to human performance on this database.
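
The pipeline described in the abstract (UBM, per-utterance re-training, mean supervector, SVM) can be sketched roughly as follows. This is only an illustrative sketch using scikit-learn, not the authors' implementation. Several things here are assumptions that the abstract does not specify: the synthetic Gaussian frames standing in for RASTA-PLP features, the helper names (fake_utterance, train_ubm, supervector), the mixture size, the diagonal covariances, the number of EM iterations used for re-training, and the linear SVM kernel.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)
N_DIM, N_COMP = 4, 3  # assumed feature dimension and number of mixture components


def fake_utterance(label, n_frames=60):
    """Synthetic stand-in for the RASTA-PLP frame sequence of one utterance."""
    return rng.normal(loc=0.7 * label, scale=1.0, size=(n_frames, N_DIM))


def train_ubm(frames):
    """Fit the UBM as a diagonal-covariance GMM on frames pooled over all utterances."""
    ubm = GaussianMixture(n_components=N_COMP, covariance_type="diag", random_state=0)
    return ubm.fit(frames)


def supervector(ubm, frames, n_iter=5):
    """Re-train a copy of the UBM on one utterance and concatenate its mean vectors."""
    gmm = GaussianMixture(
        n_components=ubm.n_components,
        covariance_type="diag",
        weights_init=ubm.weights_,        # start EM from the UBM parameters
        means_init=ubm.means_,
        precisions_init=ubm.precisions_,
        max_iter=n_iter,                  # assumed: only a few EM steps per utterance
        random_state=0,
    )
    gmm.fit(frames)
    return gmm.means_.ravel()             # length N_COMP * N_DIM


# Two hypothetical emotion classes, 20 synthetic utterances each.
labels = np.array([0, 1] * 20)
utterances = [fake_utterance(y) for y in labels]

ubm = train_ubm(np.vstack(utterances))
X = np.array([supervector(ubm, u) for u in utterances])

# Train the SVM on the first 30 supervectors, predict the remaining 10.
clf = SVC(kernel="linear").fit(X[:30], labels[:30])
predictions = clf.predict(X[30:])
```

Each utterance, whatever its length, is mapped to a fixed-length vector of N_COMP * N_DIM values, which is what lets a standard SVM operate on variable-length speech. Because every utterance model starts from the same UBM, the components stay roughly aligned across utterances, so corresponding supervector entries remain comparable.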