Semi-coupled hidden Markov model with state-based alignment strategy for audio-visual emotion recognition

  • Authors:
  • Jen-Chun Lin; Chung-Hsien Wu; Wen-Li Wei

  • Affiliation (all authors):
  • Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan, R.O.C.

  • Venue:
  • ACII'11: Proceedings of the 4th International Conference on Affective Computing and Intelligent Interaction - Volume Part I
  • Year:
  • 2011


Abstract

This paper presents an approach to bi-modal emotion recognition based on a semi-coupled hidden Markov model (SC-HMM). A simplified state-based bi-modal alignment strategy in the SC-HMM is proposed to align the temporal relation between the states of the audio and visual streams. With this strategy, the proposed SC-HMM can alleviate the problem of data sparseness and achieve better statistical dependency between the states of the audio and visual HMMs in most real-world scenarios. For performance evaluation, audio-visual signals covering four emotional states (happy, neutral, angry, and sad) were collected. Each of the seven invited subjects was asked to utter 30 sentence types twice, producing emotional speech and facial expressions for each emotion. Experimental results show that the proposed bi-modal approach outperforms other fusion-based bi-modal emotion recognition methods.
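To make the state-based alignment idea concrete, the sketch below scores the agreement between two frame-synchronous state paths, one from an audio HMM and one from a visual HMM, using a table of log probabilities P(visual state | audio state). This is a minimal, hypothetical illustration of the cross-stream state-dependency term; the alignment table, state paths, and scoring function are illustrative assumptions, not the paper's exact SC-HMM formulation.

```python
import numpy as np

def align_score(audio_states, visual_states, align_logprob):
    """Sum log P(visual state | audio state) over frame-synchronous state pairs.

    A bi-modal decoder could add this cross-stream term to the audio and
    visual stream log-likelihoods when ranking emotion-class hypotheses.
    """
    return sum(align_logprob[a][v] for a, v in zip(audio_states, visual_states))

# Hypothetical 2-state audio HMM vs. 2-state visual HMM alignment table
# (rows: audio state, columns: visual state), stored in the log domain.
align_logprob = np.log(np.array([[0.8, 0.2],
                                 [0.3, 0.7]]))

audio_states  = [0, 0, 1, 1]   # e.g. a Viterbi path from the audio HMM
visual_states = [0, 1, 1, 1]   # e.g. a Viterbi path from the visual HMM

score = align_score(audio_states, visual_states, align_logprob)
```

A path pair whose states co-occur often under the alignment table gets a higher score, so among competing emotion models the one whose audio and visual state sequences agree best is favored, which is the intuition behind coupling the two unimodal HMMs at the state level.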