Emotion recognition from speech by combining databases and fusion of classifiers

  • Authors:
  • Iulia Lefter; Leon J. M. Rothkrantz; Pascal Wiggers; David A. Van Leeuwen

  • Affiliations:
  • Delft University of Technology, The Netherlands and The Netherlands Defense Academy; Delft University of Technology, The Netherlands and The Netherlands Defense Academy; Delft University of Technology, The Netherlands; TNO Human Factors, The Netherlands

  • Venue:
  • TSD'10 Proceedings of the 13th international conference on Text, speech and dialogue
  • Year:
  • 2010

Abstract

We explore ways to improve the generality, portability and robustness of emotion recognition systems by combining databases and by fusing classifiers. In a first experiment, we investigate the performance of an emotion detection system tested on a given database when it is trained on speech from the same database, a different database, or a mix of both. We observe that performance generally drops when the test database does not match the training material, with a few exceptions. Performance also drops when a mixed corpus of acted databases is used for training and testing is carried out on real-life recordings. In a second experiment, we investigate the effect of training multiple emotion detectors and fusing them into a single detection system. We observe a drop in the Equal Error Rate (EER) from 19.0% on average for 4 individual detectors to 4.2% when fused using FoCal [1].
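
For readers unfamiliar with the metric, the following is a minimal sketch of how an EER can be estimated from detector scores, and of what a linear score fusion looks like. It is an illustration under stated assumptions, not the procedure used in the paper: the function names `equal_error_rate` and `fuse_scores` are hypothetical, the fusion weights are hand-picked placeholders rather than trained values, and the calibration that a fusion toolkit such as FoCal would perform is not reproduced here.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER: the point where miss rate and false-alarm rate are equal."""
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is also covered.
    thresholds = np.sort(np.concatenate([target, nontarget]))
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)

    # Decision rule (assumption): a trial is accepted when score >= threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    false_alarm = np.array([(nontarget >= t).mean() for t in thresholds])

    # Pick the threshold where the two error rates are closest and average them.
    idx = np.argmin(np.abs(miss - false_alarm))
    return (miss[idx] + false_alarm[idx]) / 2.0


def fuse_scores(score_matrix, weights, offset=0.0):
    """Linear fusion: weighted sum of per-detector scores plus an offset.

    score_matrix has one row per trial and one column per detector.
    The weights are placeholders here; in practice they would be trained.
    """
    scores = np.asarray(score_matrix, dtype=float)
    return scores @ np.asarray(weights, dtype=float) + offset


if __name__ == "__main__":
    # Toy scores for illustration only; they are not data from the paper.
    target = [1.0, 2.0, 3.0, 4.0]
    nontarget = [0.0, 0.0, 0.0, 2.5]
    print("EER:", equal_error_rate(target, nontarget))
    print("Fused:", fuse_scores([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5]))
```

The EER is threshold-independent, which makes it convenient for comparing individual detectors against a fused system: each system yields one set of scores, and the same function can be applied to each.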