Speaker-independent emotion recognition exploiting a psychologically-inspired binary cascade classification schema

Authors:
Margarita Kotti; Fabio Paternò
Affiliations:
ISTI-CNR, Pisa, Italy 56124; ISTI-CNR, Pisa, Italy 56124
Venue:
International Journal of Speech Technology
Year:
2012


Abstract

This paper proposes a psychologically inspired binary cascade classification schema for speech emotion recognition. The schema improves performance because it makes commonly confused pairs of emotions distinguishable from one another. The extracted features comprise statistics of the pitch, formant, and energy contours, together with spectral, cepstral, perceptual, and temporal features, autocorrelation, MPEG-7 descriptors, Fujisaki's model parameters, voice quality, jitter, and shimmer. Selected features are fed to a k-nearest neighbor classifier and to support vector machines; two kernels are tested for the latter, linear and Gaussian radial basis function. The recently proposed speaker-independent experimental protocol is applied to the Berlin emotional speech database, for each gender separately. The best emotion recognition accuracy, 87.7%, is achieved by support vector machines with a linear kernel and outperforms state-of-the-art approaches. Statistical analysis is carried out first on the classifiers' error rates and then on the information expressed by the classifiers' confusion matrices.
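The binary cascade idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the four-emotion hierarchy, the two-dimensional toy features, and the names `leaves`, `knn_predict`, and `cascade_predict` are all assumptions made for illustration, and the abstract does not specify how the cascade is ordered. Each node splits the remaining emotions into two groups, and a binary k-nearest neighbor vote decides which branch a sample follows until a single emotion remains.

```python
from collections import Counter
from math import dist

# Hypothetical hierarchy (an assumption, not taken from the paper):
# the first stage separates two groups of emotions, later stages
# separate the emotions inside each group.
CASCADE = (("anger", "happiness"), ("sadness", "neutral"))

# Toy 2-D feature vectors, invented for illustration only.
TRAIN_X = [
    (0.90, 0.10), (0.80, 0.20), (0.85, 0.15),   # anger
    (0.90, 0.90), (0.80, 0.80), (0.85, 0.90),   # happiness
    (0.10, 0.10), (0.20, 0.20), (0.15, 0.10),   # sadness
    (0.10, 0.90), (0.20, 0.80), (0.15, 0.85),   # neutral
]
TRAIN_Y = ["anger"] * 3 + ["happiness"] * 3 + ["sadness"] * 3 + ["neutral"] * 3


def leaves(node):
    """Return the set of emotion labels reachable from a cascade node."""
    if isinstance(node, str):
        return {node}
    left, right = node
    return leaves(left) | leaves(right)


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cascade_predict(node, train_x, train_y, x, k=3):
    """Walk the cascade, making one binary decision per stage."""
    while not isinstance(node, str):
        left, right = node
        left_labels = leaves(left)
        node_labels = leaves(node)
        # Keep only samples whose emotion is still a candidate at this
        # stage and relabel them as belonging to the left or right group.
        stage_x, stage_y = [], []
        for features, label in zip(train_x, train_y):
            if label in node_labels:
                stage_x.append(features)
                stage_y.append("L" if label in left_labels else "R")
        side = knn_predict(stage_x, stage_y, x, k)
        node = left if side == "L" else right
    return node


if __name__ == "__main__":
    print(cascade_predict(CASCADE, TRAIN_X, TRAIN_Y, (0.82, 0.12)))
    print(cascade_predict(CASCADE, TRAIN_X, TRAIN_Y, (0.12, 0.88)))
```

In this sketch every stage only ever has to separate two groups, so a pair of emotions that a flat multi-class classifier tends to confuse can be given its own dedicated binary decision. The k-nearest neighbor vote could be swapped for any other binary classifier, such as a support vector machine.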