A dimensional approach to emotion recognition of speech from movies

  • Authors:
  • Theodoros Giannakopoulos, Aggelos Pikrakis, Sergios Theodoridis

  • Affiliations:
  • Dept. of Informatics and Telecommunications, University of Athens, Greece (all authors)

  • Venue:
  • ICASSP '09: Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
  • Year:
  • 2009

Abstract

In this paper, we present a novel method for extracting affective information from movies based on speech data. The method relies on a 2-D representation of speech emotions, the Emotion Wheel. The goal is twofold. First, we investigate whether the Emotion Wheel offers a good representation of the emotions associated with speech signals. To this end, several human annotators manually labeled speech data from movies using the Emotion Wheel, and their level of disagreement was computed as a measure of representation quality. The results indicate that the Emotion Wheel is a good representation of emotions in speech data. Second, a regression approach is adopted to predict the location of an unknown speech segment on the Emotion Wheel, where each speech segment is represented by a vector of ten audio features. The results indicate that the resulting architecture can estimate the emotional content of movie speech with sufficient accuracy.
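
For illustration only, the sketch below shows the general dimensional-regression idea described in the abstract: mapping a 10-dimensional audio feature vector per speech segment to 2-D Emotion Wheel coordinates (valence, arousal). The random placeholder data, the choice of k-NN regression, and the error measure are assumptions for the example, not the authors' exact features, regressor, or evaluation protocol.

```python
# Illustrative sketch (not the paper's exact pipeline): regress a 10-D audio
# feature vector per speech segment onto 2-D Emotion Wheel coordinates.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Placeholder data: in the paper, X would hold the ten audio features per
# segment and y the human-annotated Emotion Wheel positions.
X = rng.normal(size=(500, 10))              # 500 segments, 10 audio features each
y = rng.uniform(-1.0, 1.0, size=(500, 2))   # (valence, arousal) annotations

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A multi-output regressor predicts both wheel coordinates jointly;
# k-NN is used here purely as a simple stand-in.
model = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
pred = model.predict(X_test)                # shape (n_test, 2): estimated wheel position

# Mean Euclidean distance on the wheel as a simple error measure.
err = np.linalg.norm(pred - y_test, axis=1).mean()
print(f"mean distance on the Emotion Wheel: {err:.3f}")
```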