The recognition of emotion from speech acoustics is an important problem in human-machine interaction, with many potential applications. In this paper, we first compare four ways to extend binary support vector machines (SVMs) to multiclass classification for recognising emotions from speech: two standard SVM schemes (one-versus-one and one-versus-rest) and two methods (DAG and UDT) that form a hierarchy of classifiers, each making a distinct binary decision about class membership. These are trained and tested on 6552 features per speech sample, extracted with the OpenEAR toolkit from three databases of acted emotional speech (DES, Berlin and Serbian) and a database of spontaneous speech (the FAU Aibo Emotion Corpus). Analysis of the errors made by these classifiers leads us to apply non-metric multidimensional scaling (NMDS) to produce a compact, two-dimensional representation of the data suitable for guiding the choice of decision hierarchy. This representation can be interpreted in terms of the well-known valence-arousal model of emotion. We find that this model does not fit the data particularly well: although the arousal dimension is easily identified, valence is not well represented in the transformed data. We therefore describe a new hierarchical classification technique whose structure is derived from the NMDS representation, which we call Data-Driven Dimensional Emotion Classification (3DEC). This new method is compared with the best of the four classifiers studied earlier and with a state-of-the-art classification method on all four databases. We find no significant difference between the three approaches in speaker-dependent performance; however, for the much more interesting and important case of speaker-independent emotion classification, 3DEC significantly outperforms its competitors.
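To make the pipeline concrete, the following is a minimal sketch (not the authors' implementation) of two of the standard multiclass SVM schemes and the confusion-driven NMDS step, written with scikit-learn. Synthetic feature vectors stand in for the 6552-dimensional OpenEAR features, and all dataset sizes, class counts and kernel settings here are illustrative assumptions.

```python
# Illustrative sketch only: synthetic data replaces the OpenEAR features,
# and the DAG/UDT hierarchies and 3DEC itself are not reproduced here.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.manifold import MDS

# Stand-in for per-sample acoustic feature vectors (assumed sizes).
X, y = make_classification(n_samples=600, n_features=50, n_informative=20,
                           n_classes=5, n_clusters_per_class=1,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0)

# Scheme 1: one-versus-one (SVC's native multiclass strategy).
ovo = SVC(kernel='rbf', decision_function_shape='ovo').fit(X_tr, y_tr)
# Scheme 2: one-versus-rest.
ovr = OneVsRestClassifier(SVC(kernel='rbf')).fit(X_tr, y_tr)

for name, clf in [('one-vs-one', ovo), ('one-vs-rest', ovr)]:
    print(name, 'accuracy:', accuracy_score(y_te, clf.predict(X_te)))

# Turn the confusion matrix into a symmetric dissimilarity between
# classes: emotions that are frequently confused count as "close".
C = confusion_matrix(y_te, ovo.predict(X_te)).astype(float)
C /= C.sum(axis=1, keepdims=True)        # row-normalise to rates
dissimilarity = 1.0 - (C + C.T) / 2.0    # symmetrise and invert
np.fill_diagonal(dissimilarity, 0.0)

# Non-metric MDS embeds the classes in two dimensions; in the paper,
# a layout of this kind guides the structure of the decision hierarchy.
nmds = MDS(n_components=2, metric=False, dissimilarity='precomputed',
           random_state=0)
coords = nmds.fit_transform(dissimilarity)
print('2-D class coordinates:\n', coords)
```

The sketch stops at the two-dimensional embedding; in the paper, the class layout it produces is what determines the binary splits of the 3DEC hierarchy, rather than an a-priori valence-arousal structure.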