Fundamentals of speech recognition
Affective computing
Automatic Analysis of Facial Expressions: The State of the Art
IEEE Transactions on Pattern Analysis and Machine Intelligence
Emotions and personality in agent design
Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 1
Embodied contextual agent in information delivering application
Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 2
MAUI: a multimodal affective user interface
Proceedings of the tenth ACM international conference on Multimedia
Vision-Based Gesture Recognition: A Review
GW '99 Proceedings of the International Gesture Workshop on Gesture-Based Communication in Human-Computer Interaction
“Put-that-there”: Voice and gesture at the graphics interface
SIGGRAPH '80 Proceedings of the 7th annual conference on Computer graphics and interactive techniques
Emotion Recognition Using a Cauchy Naive Bayes Classifier
ICPR '02 Proceedings of the 16th International Conference on Pattern Recognition (ICPR'02) - Volume 1
Toward Multimodal Interpretation in a Natural Speech/Gesture Interface
ICIIS '99 Proceedings of the 1999 International Conference on Information Intelligence and Systems
Analysis of emotion recognition using facial expressions, speech and multimodal information
Proceedings of the 6th international conference on Multimodal interfaces
Online face detection and user authentication
Proceedings of the 13th annual ACM international conference on Multimedia
Product HMMs for audio-visual continuous speech recognition using facial animation parameters
ICME '03 Proceedings of the 2003 International Conference on Multimedia and Expo - Volume 1
Using noninvasive wearable computers to recognize human emotions from physiological signals
EURASIP Journal on Applied Signal Processing
Active affective state detection and user assistance with dynamic Bayesian networks
IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans
IVA '07 Proceedings of the 7th international conference on Intelligent Virtual Agents
Image and Vision Computing
Automatic temporal segment detection and affect recognition from face and body display
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics - Special issue on human computing
Multimodal information fusion application to human emotion recognition from face and speech
Multimedia Tools and Applications
Emotion recognition using bimodal data fusion
Proceedings of the 12th International Conference on Computer Systems and Technologies
Fusion of audio- and visual cues for real-life emotional human robot interaction
DAGM'11 Proceedings of the 33rd international conference on Pattern Recognition
Hybrid fusion approach for detecting affects from multichannel physiology
ACII'11 Proceedings of the 4th international conference on Affective computing and intelligent interaction - Volume Part I
Multimodal affect recognition in intelligent tutoring systems
ACII'11 Proceedings of the 4th international conference on Affective computing and intelligent interaction - Volume Part II
3D virtual worlds and the metaverse: Current status and future possibilities
ACM Computing Surveys (CSUR)
During face-to-face communication, it has been suggested that as much as 70% of what people convey when talking directly with others passes through paralanguage involving multiple modalities combined together (e.g. voice tone and volume, body language). In an attempt to make human-computer interaction more similar to human-human communication and enhance its naturalness, research on the sensory acquisition and interpretation of single modalities of human expression has seen steady progress over the last decade. This progress makes artificial sensor fusion of multiple modalities an increasingly important research domain: fusion can improve accuracy when the messages across modalities are congruent, and it can potentially detect incongruent messages across modalities (incongruence being itself a message about the nature of the information being conveyed). Accurate interpretation of emotional signals - quintessentially multimodal - would therefore particularly benefit from multimodal sensor fusion and interpretation algorithms. In this paper we review the state of the art in multimodal fusion and describe one way to implement a generic framework for multimodal emotion recognition. The system is developed within the MAUI framework [31] and Scherer's Component Process Theory (CPT) [49, 50, 51, 24, 52], with the goal of being modular and adaptive. We want the designed framework to accept different single- and multi-modality recognition systems and to automatically adapt the fusion algorithm to find optimal solutions. The system also aims to be adaptive to channel (and system) reliability.
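To make the fusion idea concrete, below is a minimal Python sketch of reliability-weighted late fusion over per-modality emotion estimates. All names here (ModalityEstimate, fuse_estimates, incongruent) and the label set are hypothetical illustrations, not part of MAUI or CPT; the sketch only assumes that each single-modality recognizer outputs a probability distribution over a shared set of emotion labels together with a channel reliability score.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed shared emotion label set; the actual framework's categories may differ.
EMOTIONS = ["anger", "joy", "sadness", "fear", "neutral"]

@dataclass
class ModalityEstimate:
    """Hypothetical output of one single-modality recognizer."""
    modality: str               # e.g. "face", "voice", "posture"
    probs: Dict[str, float]     # probability distribution over EMOTIONS
    reliability: float          # channel/system reliability in [0, 1]

def fuse_estimates(estimates: List[ModalityEstimate]) -> Dict[str, float]:
    """Reliability-weighted late fusion: a convex combination of the
    per-modality distributions, so less reliable channels contribute less."""
    total = sum(e.reliability for e in estimates)
    if total == 0.0:
        # No trusted channel: fall back to a uniform distribution.
        return {label: 1.0 / len(EMOTIONS) for label in EMOTIONS}
    return {
        label: sum(e.reliability * e.probs.get(label, 0.0) for e in estimates) / total
        for label in EMOTIONS
    }

def incongruent(estimates: List[ModalityEstimate], min_reliability: float = 0.5) -> bool:
    """Flag a possible cross-modal incongruence: reliable channels whose
    top-ranked labels disagree (the disagreement is itself informative)."""
    tops = [max(e.probs, key=e.probs.get)
            for e in estimates if e.reliability >= min_reliability]
    return len(set(tops)) > 1

# Example: the face channel suggests joy while the voice suggests anger.
face = ModalityEstimate("face", {"joy": 0.7, "neutral": 0.3}, reliability=0.9)
voice = ModalityEstimate("voice", {"anger": 0.6, "neutral": 0.4}, reliability=0.8)
print(fuse_estimates([face, voice]))  # blended distribution over EMOTIONS
print(incongruent([face, voice]))     # True: the two reliable channels disagree
```

The weighted average is only one simple combination rule; the framework's stated goal of automatically adapting the fusion algorithm would amount to learning or tuning such per-channel weights (or swapping the combination rule entirely) as channel and system reliability change.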