This work presents an integrated system capable of generating animations with realistic dynamics, including individualized nuances, of three-dimensional (3-D) human faces driven by speech acoustics. The system captures short-lived phenomena in the orofacial dynamics of a given speaker by tracking the 3-D location of various MPEG-4 facial points through stereovision. A perceptual transformation of the speech spectral envelope and prosodic cues are combined into an acoustic feature vector used to predict 3-D orofacial dynamics by means of a nearest-neighbor algorithm. The Karhunen-Loève transformation is used to identify the principal components of orofacial motion, decoupling perceptually natural components from experimental noise. We also present a highly optimized MPEG-4-compliant player capable of generating audio-synchronized animations at 60 frames/s. The player is based on a pseudo-muscle model augmented with a nonpenetrable ellipsoidal structure that approximates the skull and the jaw. This structure adds a sense of volume that yields more realistic dynamics than existing simplified pseudo-muscle-based approaches, yet it is simple enough to run at the desired frame rate. Experimental results on an audiovisual database of compact TIMIT sentences illustrate the performance of the complete system.
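The two core mapping steps described above, a Karhunen-Loève (PCA) decomposition that keeps the dominant components of facial motion while discarding experimental noise, and a nearest-neighbor lookup from acoustic features to 3-D facial poses, can be sketched as follows. This is a minimal illustration with synthetic data, not the paper's implementation: the array names, dimensions, and the number of retained components are assumptions chosen for the example.

```python
import numpy as np

# Toy stand-in for an audiovisual corpus: each frame pairs an acoustic
# feature vector with the 3-D coordinates of tracked facial points
# (sizes here are illustrative assumptions, not the paper's values).
rng = np.random.default_rng(0)
n_frames, n_acoustic, n_facial = 200, 16, 30
acoustic = rng.normal(size=(n_frames, n_acoustic))
facial = rng.normal(size=(n_frames, n_facial))

# Karhunen-Loève transform (PCA): eigendecompose the covariance of the
# facial trajectories and keep only the leading components, treating the
# remainder as experimental noise.
mean = facial.mean(axis=0)
centered = facial - mean
cov = centered.T @ centered / (n_frames - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]      # components by descending variance
k = 5                                  # retained components (assumed)
basis = eigvecs[:, order[:k]]          # (n_facial, k) projection basis
coeffs = centered @ basis              # low-dimensional motion codes

def predict_facial(query):
    """Nearest-neighbor mapping: find the training frame whose acoustic
    features are closest to the query and return its denoised facial
    pose, reconstructed from the k retained KL components."""
    dists = np.linalg.norm(acoustic - query, axis=1)
    i = np.argmin(dists)
    return mean + coeffs[i] @ basis.T

pose = predict_facial(acoustic[10])    # pose for one acoustic frame
```

In the full system the query vector would hold the perceptually transformed spectral envelope and prosodic cues for each audio frame, and the reconstructed pose would drive the MPEG-4 facial points of the player.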