Speech emotion classification and public speaking skill assessment

  • Authors:
  • Tomas Pfister;Peter Robinson

  • Affiliations:
  • University of Cambridge, Computer Laboratory, Cambridge, UK;University of Cambridge, Computer Laboratory, Cambridge, UK

  • Venue:
  • HBU'10: Proceedings of the First International Conference on Human Behavior Understanding
  • Year:
  • 2010

Abstract

This paper presents a new classification algorithm for real-time inference of emotions from the non-verbal features of speech. It identifies simultaneously occurring emotional states by recognising correlations between emotions and features such as pitch, loudness and energy. Pairwise classifiers are constructed for nine classes from the Mind Reading emotion corpus, yielding an average cross-validation accuracy of 89% for the pairwise machines and 86% for the fused machine. The paper also shows a novel application of the classifier for assessing public speaking skills, achieving an average cross-validation accuracy of 81%. Optimisation of support vector machine coefficients is shown to improve the accuracy by up to 25%. The classifier outperforms previous research on the same emotion corpus and achieves real-time performance.
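The pairwise scheme mentioned in the abstract can be illustrated with a minimal sketch. Nine classes give 9 × 8 / 2 = 36 binary machines, whose outputs are then fused into a ranking. Everything below is an assumption made for illustration: the class labels `c0` to `c8` are placeholders, a nearest-centroid rule stands in for the support vector machines, the two-dimensional feature vectors are synthetic, and fusion by vote counting is one common choice rather than necessarily the paper's method.

```python
from collections import Counter
from itertools import combinations


def centroid(rows):
    """Mean feature vector of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise(samples):
    """Build one binary decision function per unordered pair of labels.

    samples maps a label to its list of training vectors. Each binary
    machine here is a nearest-centroid rule (a stand-in for an SVM).
    """
    centroids = {label: centroid(rows) for label, rows in samples.items()}
    machines = {}
    for a, b in combinations(sorted(samples), 2):
        ca, cb = centroids[a], centroids[b]

        def decide(x, a=a, b=b, ca=ca, cb=cb):
            return a if sq_dist(x, ca) <= sq_dist(x, cb) else b

        machines[(a, b)] = decide
    return machines


def fuse(machines, x):
    """Fuse the pairwise decisions by counting votes per label.

    Returns (label, votes) pairs sorted by descending vote count, so
    several high-scoring labels can be read off at once.
    """
    votes = Counter(decide(x) for decide in machines.values())
    return votes.most_common()


# Synthetic training data: class i clusters around (10*i, 10*i).
samples = {
    f"c{i}": [[10.0 * i - 1, 10.0 * i], [10.0 * i + 1, 10.0 * i]]
    for i in range(9)
}
machines = train_pairwise(samples)
ranking = fuse(machines, [31.0, 29.0])
```

With nine labels, `len(machines)` is 36, and the top entry of `ranking` is the label whose centroid lies closest to the query vector. Reading more than one label off the top of the ranking is one way a fused machine could report simultaneously occurring states.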