Hidden Markov model-based speech emotion recognition

Authors:
B. Schuller;G. Rigoll;M. Lang
Affiliations:
Inst. for Human-Comput. Commun., Technische Univ. Munchen, Germany;Inst. for Human-Comput. Commun., Technische Univ. Munchen, Germany;Inst. for Human-Comput. Commun., Technische Univ. Munchen, Germany
Venue:
ICME '03 Proceedings of the 2003 International Conference on Multimedia and Expo - Volume 2
Year:
2003

Abstract

In this contribution we introduce speech emotion recognition using continuous hidden Markov models. Two methods are proposed and compared throughout the paper. In the first method, global statistics of an utterance, derived from the raw pitch and energy contours of the speech signal, are classified by Gaussian mixture models. The second method increases temporal complexity by applying continuous hidden Markov models with several states to low-level instantaneous features instead of global statistics. The paper addresses the design of working recognition engines and the results achieved with both alternatives. A speech corpus consisting of acted and spontaneous emotion samples in German and English is described in detail. Both engines were trained and tested on this same speech corpus. Recognition of seven discrete emotions exceeded an 86% recognition rate. As a basis of comparison, the judgment of human judges classifying the same corpus, at a 79.8% recognition rate, was analyzed.
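To make the contrast between the two methods concrete, the following is a minimal sketch and not the authors' implementation. It assumes diagonal-covariance Gaussian emissions, a toy two-state left-to-right topology, two hypothetical emotion labels, and hand-picked parameters over a hypothetical two-dimensional frame (pitch in Hz, normalized energy); none of these values come from the paper. `global_statistics` illustrates the kind of utterance-level vector the first method could pass to a Gaussian mixture model, while `hmm_log_likelihood` scores a frame sequence with the forward algorithm, so that the emotion whose model gives the highest likelihood is chosen.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    top = max(values)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(v - top) for v in values))

def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0.0 else -math.inf

def hmm_log_likelihood(frames, hmm):
    """Forward algorithm in the log domain: returns log p(frames | hmm)."""
    n = len(hmm["start"])
    alpha = [
        safe_log(hmm["start"][s]) + log_gauss(frames[0], hmm["means"][s], hmm["vars"][s])
        for s in range(n)
    ]
    for x in frames[1:]:
        alpha = [
            logsumexp([alpha[r] + safe_log(hmm["trans"][r][s]) for r in range(n)])
            + log_gauss(x, hmm["means"][s], hmm["vars"][s])
            for s in range(n)
        ]
    return logsumexp(alpha)

def classify(frames, models):
    """Pick the emotion label whose HMM assigns the highest likelihood."""
    return max(models, key=lambda label: hmm_log_likelihood(frames, models[label]))

def global_statistics(contour):
    """Utterance-level statistics of one contour (assumed feature set):
    mean, standard deviation, minimum, maximum, range."""
    n = len(contour)
    mean = sum(contour) / n
    std = math.sqrt(sum((c - mean) ** 2 for c in contour) / n)
    return [mean, std, min(contour), max(contour), max(contour) - min(contour)]

# Hypothetical left-to-right transition matrix shared by both toy models.
LEFT_RIGHT = [[0.7, 0.3], [0.0, 1.0]]

# Hypothetical per-emotion models; all parameter values are illustrative only.
models = {
    "anger": {
        "start": [1.0, 0.0], "trans": LEFT_RIGHT,
        "means": [[220.0, 0.6], [260.0, 0.8]],
        "vars": [[400.0, 0.02], [400.0, 0.02]],
    },
    "neutral": {
        "start": [1.0, 0.0], "trans": LEFT_RIGHT,
        "means": [[120.0, 0.3], [110.0, 0.25]],
        "vars": [[400.0, 0.02], [400.0, 0.02]],
    },
}

# Toy frame sequences of [pitch in Hz, normalized energy].
high_frames = [[215.0, 0.55], [225.0, 0.65], [255.0, 0.8], [265.0, 0.75]]
low_frames = [[125.0, 0.3], [118.0, 0.28], [112.0, 0.26], [108.0, 0.24]]

if __name__ == "__main__":
    print(classify(high_frames, models))
    print(classify(low_frames, models))
    print(global_statistics([f[0] for f in high_frames]))
```

In a real system the model parameters would be estimated from labeled training utterances, for example with Baum-Welch re-estimation, instead of being set by hand. The log-domain forward pass is used here because multiplying many small frame likelihoods directly underflows on longer utterances.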