Natural head motion synthesis driven by acoustic prosodic features

  • Authors:
  • Carlos Busso; Zhigang Deng; Ulrich Neumann; Shrikanth Narayanan

  • Affiliations:
  • Integrated Media Systems Center, Department of Electrical Engineering, Viterbi School of Engineering, University of Southern California, 3740 McClintock Ave., Room 400, Los Angeles, CA 90089-2564, ...

  • Venue:
  • Computer Animation and Virtual Worlds - CASA 2005
  • Year:
  • 2005

Abstract

Natural head motion is important for realistic facial animation and engaging human–computer interactions. In this paper, we present a novel data-driven approach to synthesize appropriate head motion by sampling from trained hidden Markov models (HMMs). First, while an actress recited a corpus specifically designed to elicit various emotions, her 3D head motion was captured and further processed to construct a head motion database that included synchronized speech information. Then, an HMM for each discrete head motion representation (derived directly from the data using vector quantization) was trained using acoustic prosodic features derived from speech. Finally, first-order Markov models and interpolation techniques were used to smooth the synthesized sequence. Our comparison experiments and novel synthesis results show that the synthesized head motions follow the temporal dynamic behavior of real human subjects. Copyright © 2005 John Wiley & Sons, Ltd.
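
As a rough illustration of the pipeline the abstract outlines, below is a minimal NumPy sketch: head poses are vector-quantized into discrete codewords, a simple per-cluster Gaussian scorer over prosodic features (a crude stand-in for the paper's trained HMMs, whose details the abstract does not give) selects a codeword per speech frame, and exponential smoothing approximates the first-order Markov smoothing and interpolation step. Every function name, parameter, and data shape here is an illustrative assumption, not the authors' implementation.

```python
# Hypothetical sketch of the VQ + prosody-driven-selection + smoothing
# pipeline described in the abstract. NumPy only; not the paper's code.
import numpy as np

rng = np.random.default_rng(0)

def vector_quantize(poses, k=8, iters=50):
    """Cluster head poses (N x 3 Euler angles) into k discrete codewords
    with plain k-means; returns (codebook, per-frame labels)."""
    codebook = poses[rng.choice(len(poses), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((poses[:, None] - codebook[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                codebook[j] = poses[labels == j].mean(axis=0)
    return codebook, labels

def fit_gaussians(prosody, labels, k):
    """One diagonal Gaussian over prosodic features (e.g. pitch, energy)
    per head-pose cluster; stands in for the per-cluster HMMs."""
    stats = []
    for j in range(k):
        x = prosody[labels == j]
        if len(x) == 0:          # guard against an empty cluster
            x = prosody
        stats.append((x.mean(axis=0), x.var(axis=0) + 1e-6))
    return stats

def log_likelihood(x, mean, var):
    # Diagonal-Gaussian log-density of one prosodic feature vector.
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var).sum()

def synthesize(prosody, codebook, stats, alpha=0.8):
    """Pick the best-scoring codeword per frame, then exponentially smooth
    the pose sequence (in place of the paper's first-order Markov
    smoothing plus interpolation)."""
    out, prev = [], codebook[0]
    for x in prosody:
        j = int(np.argmax([log_likelihood(x, m, v) for m, v in stats]))
        prev = alpha * prev + (1 - alpha) * codebook[j]
        out.append(prev.copy())
    return np.array(out)

# Toy data: 500 frames of 3-DOF head rotation and 2-D prosodic features.
poses = rng.normal(size=(500, 3))
prosody = rng.normal(size=(500, 2))
codebook, labels = vector_quantize(poses)
stats = fit_gaussians(prosody, labels, k=len(codebook))
motion = synthesize(prosody, codebook, stats)
print(motion.shape)  # (500, 3)
```

The Gaussian scorer deliberately ignores temporal structure; the paper's HMMs additionally model how prosody evolves within each head-motion unit, which is what lets the synthesized motion track the temporal dynamics of real speakers.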