Virtual character performance from speech

Authors:
Stacy Marsella;Yuyu Xu;Margaux Lhommet;Andrew Feng;Stefan Scherer;Ari Shapiro
Venue:
Proceedings of the 12th ACM SIGGRAPH/Eurographics Symposium on Computer Animation
Year:
2013

Abstract

We demonstrate a method for generating a 3D virtual character performance from an audio signal by inferring the acoustic and semantic properties of the utterance. A prosodic analysis of the acoustic signal detects stress and pitch, relates them to the spoken words, and identifies the agitation state. Our rule-based system performs a shallow analysis of the utterance text to determine its semantic, pragmatic and rhetorical content. Based on these analyses, the system generates facial expressions and behaviors including head movements, eye saccades, gestures, blinks and gazes. Our technique synthesizes the performance and generates novel gesture animations based on coarticulation with other closely scheduled animations. Because our method uses semantics in addition to prosody, it generates virtual character performances that are more appropriate than those produced by methods that use only prosody. We perform a study showing that our technique outperforms methods that use prosody alone.
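
As a rough illustration of how semantic rules and prosodic stress might be combined, the minimal Python sketch below maps timed words to behaviors. It is not the authors' implementation: the `Word` record, the `RULES` table, the behavior names, and the `generate_behaviors` function are all hypothetical, and the stress flags and agitation state are assumed to come from a separate prosodic analysis that is not shown.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float    # onset time in seconds
    stressed: bool  # assumed output of a prosodic analysis (not shown)


# Hypothetical keyword rules: word -> (behavior type, behavior name).
# These entries are illustrative only and are not taken from the paper.
RULES = {
    "no": ("head", "shake"),
    "not": ("head", "shake"),
    "yes": ("head", "nod"),
    "you": ("gesture", "point_at_listener"),
    "everything": ("gesture", "sweep"),
}


def generate_behaviors(words, agitated=False):
    """Return a time-sorted list of (start, type, name) behavior tuples."""
    behaviors = []
    for w in words:
        key = w.text.lower().strip(".,!?")
        rule = RULES.get(key)
        if rule is not None:
            # A semantic rule matched: schedule its behavior at the word onset.
            behaviors.append((w.start, rule[0], rule[1]))
        elif w.stressed:
            # No semantic rule: fall back to a beat gesture on stressed words,
            # scaled by the (assumed) agitation state.
            name = "beat_large" if agitated else "beat_small"
            behaviors.append((w.start, "gesture", name))
    return sorted(behaviors)


if __name__ == "__main__":
    utterance = [
        Word("I", 0.0, False),
        Word("did", 0.2, True),
        Word("not", 0.45, True),
        Word("say", 0.7, False),
        Word("that.", 0.9, True),
    ]
    for behavior in generate_behaviors(utterance):
        print(behavior)
```

In this sketch a semantic rule takes priority over a prosodic beat, which mirrors the abstract's claim that semantics adds information beyond prosody. A prosody-only baseline would correspond to running the same function with an empty `RULES` table.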