Audio-based emotion recognition from natural conversations based on co-occurrence matrix and frequency domain energy distribution features

Authors:
Aya Sayedelahl;Pouria Fewzee;Mohamed S. Kamel;Fakhri Karray
Affiliations:
Pattern Analysis and Machine Intelligence Lab, Electrical and Computer Engineering, University of Waterloo, Canada (all authors)
Venue:
ACII'11 Proceedings of the 4th international conference on Affective computing and intelligent interaction - Volume Part II
Year:
2011

Abstract

Emotion recognition from natural speech is a challenging problem. The audio sub-challenge represents an initial step towards building an efficient audio-visual emotion recognition system that can detect emotions in real-life applications (e.g., human-machine interaction and/or communication). The SEMAINE database, which consists of emotionally colored conversations, is used as the benchmark database. This paper presents our system for recognizing emotion from speech in terms of positive/negative valence and high/low arousal, expectancy, and power. We introduce a new set of features, including co-occurrence matrix based features and frequency domain energy distribution based features. Comparisons between well-known prosodic and spectral features and the new features are presented. Classification using the proposed features shows promising results compared to the classical features on both the development and test data sets.
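
As a rough illustration of the two feature families named in the abstract, the sketch below computes a co-occurrence matrix over two quantized frame-level contours and a per-band spectral energy distribution. The abstract does not say which contours, quantization, or frequency bands the authors use, so everything here is an assumption for illustration: the choice of zero-crossing rate and short-time energy, the 8 quantization levels, the 4 equal-width bands, the frame sizes, and all function names are hypothetical and not taken from the paper.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def quantize(v, levels):
    """Map a contour linearly onto integer bins 0..levels-1."""
    lo, hi = v.min(), v.max()
    if hi == lo:
        return np.zeros(len(v), dtype=int)
    q = np.floor((v - lo) / (hi - lo) * levels).astype(int)
    return np.minimum(q, levels - 1)

def cooccurrence_matrix(a, b, levels=8):
    """Normalized joint histogram of two quantized frame-level contours."""
    qa, qb = quantize(a, levels), quantize(b, levels)
    m = np.zeros((levels, levels))
    np.add.at(m, (qa, qb), 1)  # count how often each (qa, qb) pair occurs
    return m / m.sum()

def band_energy_distribution(frames, n_bands=4):
    """Mean fraction of spectral energy in each of n_bands equal-width bands."""
    window = np.hanning(frames.shape[1])
    spec = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    bands = np.array_split(spec, n_bands, axis=1)
    e = np.stack([b.sum(axis=1) for b in bands], axis=1)
    e = e / np.maximum(e.sum(axis=1, keepdims=True), 1e-12)
    return e.mean(axis=0)

def extract_features(x, frame_len=400, hop=160):
    """Concatenate the flattened co-occurrence matrix and band energy fractions."""
    frames = frame_signal(x, frame_len, hop)
    ste = (frames ** 2).mean(axis=1)  # short-time energy per frame (assumed contour)
    zcr = (np.abs(np.diff(np.sign(frames), axis=1)) > 0).mean(axis=1)  # zero-crossing rate
    com = cooccurrence_matrix(zcr, ste)
    return np.concatenate([com.ravel(), band_energy_distribution(frames)])
```

A vector such as the one returned by `extract_features` could then be passed to a standard classifier and compared against prosodic and spectral baselines, which is the kind of comparison the abstract describes.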