Toward natural and efficient human-computer interaction
ICME'09 Proceedings of the 2009 IEEE international conference on Multimedia and Expo
Multimedia multimodal methodologies
PCM'10 Proceedings of the Advances in Multimedia Information Processing and 11th Pacific Rim Conference on Multimedia: Part II
Kernel fusion of audio and visual information for emotion recognition
ICIAR'11 Proceedings of the 8th international conference on Image analysis and recognition - Volume Part II
Machine recognition of human emotional state is an important component of efficient human-computer interaction. Most existing work addresses this problem using audio signals alone or visual information only. In this paper, we explore a systematic approach to recognizing human emotional state from audiovisual signals. The audio characteristics of emotional speech are represented by extracted prosodic, Mel-frequency Cepstral Coefficient (MFCC), and formant frequency features. A face detection scheme based on the HSV color model separates the face from the background, and the visual information is represented by Gabor wavelet features. We perform feature selection with a stepwise method based on the Mahalanobis distance, and the selected audiovisual features are used to classify the data into their corresponding emotions. Based on a comparative study of different classification algorithms and the specific characteristics of individual emotions, a novel multiclassifier scheme is proposed to boost recognition performance. The feasibility of the proposed system is tested on a database of human subjects from different language and cultural backgrounds. Experimental results demonstrate the effectiveness of the proposed system; the multiclassifier scheme achieves the best overall recognition rate of 82.14%.
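The HSV-based face detection step described above typically works by thresholding hue, saturation, and value to isolate skin-like pixels before locating the face region. The following is a minimal sketch of that idea in NumPy; the threshold values and function names are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

def rgb_to_hsv(img):
    """Vectorized RGB -> HSV for a float image with channels in [0, 1]."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    maxc = img.max(axis=-1)
    minc = img.min(axis=-1)
    v = maxc
    delta = maxc - minc
    # Saturation: 0 where the pixel is black (maxc == 0).
    s = np.where(maxc > 0, delta / np.where(maxc == 0, 1, maxc), 0.0)
    # Hue in [0, 1); guard against division by zero for gray pixels.
    safe = np.where(delta == 0, 1, delta)
    h = np.where(maxc == r, (g - b) / safe,
        np.where(maxc == g, 2.0 + (b - r) / safe, 4.0 + (r - g) / safe))
    h = (h / 6.0) % 1.0
    h = np.where(delta == 0, 0.0, h)
    return np.stack([h, s, v], axis=-1)

def skin_mask(img, h_max=0.14, s_min=0.2, s_max=0.68, v_min=0.35):
    """Flag skin-like pixels by thresholding H, S, and V.
    Thresholds here are rough illustrative values, not the paper's."""
    hsv = rgb_to_hsv(img)
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    return (h <= h_max) & (s >= s_min) & (s <= s_max) & (v >= v_min)
```

In practice the resulting binary mask would be cleaned with morphological operations and the largest connected component taken as the face candidate.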
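The stepwise feature selection based on the Mahalanobis distance can be pictured as greedy forward selection: at each step, add the feature that most increases the Mahalanobis separation between class means. The sketch below uses a two-class criterion with a pooled covariance for simplicity; the exact criterion, stopping rule, and multi-class handling in the paper may differ.

```python
import numpy as np

def mahalanobis_separation(X_a, X_b):
    """Squared Mahalanobis distance between two class means,
    using the average of the per-class covariances (illustrative)."""
    mu = X_a.mean(axis=0) - X_b.mean(axis=0)
    cov = (np.cov(X_a, rowvar=False) + np.cov(X_b, rowvar=False)) / 2.0
    cov = np.atleast_2d(cov)
    return float(mu @ np.linalg.solve(cov, mu))

def stepwise_select(X_a, X_b, k):
    """Greedy forward selection: repeatedly add the feature whose
    inclusion yields the largest class separation."""
    selected = []
    remaining = list(range(X_a.shape[1]))
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda j: mahalanobis_separation(
                       X_a[:, selected + [j]], X_b[:, selected + [j]]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

With well-separated classes, the first feature chosen is the one carrying the class difference, which is the intended behavior of a separability-driven selector.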