Modeling naturalistic affective states via facial, vocal, and bodily expressions recognition

  • Authors:
  • Kostas Karpouzis; George Caridakis; Loic Kessous; Noam Amir; Amaryllis Raouzaiou; Lori Malatesta; Stefanos Kollias

  • Affiliations:
  • Image, Video and Multimedia Systems Laboratory, National Technical University of Athens, Greece (Karpouzis, Caridakis, Raouzaiou, Malatesta, Kollias); Tel Aviv Academic College of Engineering, Tel Aviv, Israel (Kessous, Amir)

  • Venue:
  • ICMI'06/IJCAI'07 Proceedings of the ICMI 2006 and IJCAI 2007 International Conference on Artificial Intelligence for Human Computing
  • Year:
  • 2007

Abstract

Affective and human-centered computing have attracted considerable attention in recent years, mainly due to the abundance of devices and environments able to exploit multimodal input from users and adapt their functionality to users' preferences or individual habits. In the quest to obtain feedback from users unobtrusively, the combination of facial and hand gestures with prosody information allows us to infer the users' emotional state, relying on the best-performing modality in cases where another modality suffers from noise or poor sensing conditions. In this paper, we describe a multi-cue, dynamic approach to detecting emotion in naturalistic video sequences. In contrast to audiovisual material recorded under strictly controlled conditions, the proposed approach focuses on sequences taken from nearly real-world situations. Recognition is performed via a 'Simple Recurrent Network', which lends itself well to modeling dynamic events in both the user's facial expressions and speech. Moreover, this approach differs from existing work in that it models user expressivity using a dimensional representation of activation and valence, instead of detecting discrete 'universal emotions', which are scarce in everyday human-machine interaction. The algorithm is applied to an audiovisual database which was recorded simulating human-human discourse and which, therefore, contains less extreme expressivity and subtle variations of a number of emotion labels.
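
To illustrate the kind of architecture the abstract refers to, the sketch below shows a minimal Elman-style Simple Recurrent Network that maps a sequence of fused facial/prosodic feature vectors to a per-frame (activation, valence) estimate. This is not the authors' implementation: the class name, feature dimensionality, layer sizes, and the use of a shared fused feature vector are illustrative assumptions made only to show how a context layer lets the network model dynamic events across frames.

```python
# Minimal sketch (assumed details, not the paper's implementation): an
# Elman-style Simple Recurrent Network mapping per-frame multimodal
# features to a 2-D (activation, valence) output.
import numpy as np

class SimpleRecurrentNetwork:
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        # Input-to-hidden, context(previous hidden)-to-hidden, hidden-to-output weights
        self.W_in = rng.normal(0, 0.1, (n_hidden, n_in))
        self.W_ctx = rng.normal(0, 0.1, (n_hidden, n_hidden))
        self.W_out = rng.normal(0, 0.1, (n_out, n_hidden))
        self.b_h = np.zeros(n_hidden)
        self.b_o = np.zeros(n_out)

    def forward(self, sequence):
        """sequence: (T, n_in) array of per-frame fused facial/prosodic features."""
        h = np.zeros(self.W_ctx.shape[0])  # context layer starts at zero
        outputs = []
        for x in sequence:
            # Hidden state depends on the current input and the previous hidden state,
            # which is what allows the network to capture temporal dynamics.
            h = np.tanh(self.W_in @ x + self.W_ctx @ h + self.b_h)
            # tanh keeps the activation/valence estimates within [-1, 1]
            outputs.append(np.tanh(self.W_out @ h + self.b_o))
        return np.array(outputs)  # shape (T, 2): activation, valence per frame

# Usage example with hypothetical dimensions: 25 frames of a 40-D fused feature vector
# (e.g. facial-feature measurements concatenated with prosodic features).
srn = SimpleRecurrentNetwork(n_in=40, n_hidden=16, n_out=2)
features = np.random.default_rng(1).normal(size=(25, 40))
activation_valence = srn.forward(features)
print(activation_valence.shape)  # (25, 2)
```

In a dimensional representation such as this, the continuous activation and valence values can afterwards be coarsely quantized (e.g. into quadrants) if discrete labels are needed, which is one reason such a scheme suits the subtle, non-extreme expressivity found in naturalistic material.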