The results reported in this article form part of a larger project aimed at perceptually realistic, speech-driven animation of three-dimensional human faces, including each speaker's individual nuances. We describe the audiovisual system developed for learning the spatio-temporal relationship between speech acoustics and facial animation, covering video and speech processing, pattern analysis, and MPEG-4 compliant facial animation for a given speaker. In particular, we propose a perceptual transformation of the speech spectral envelope, which is shown to capture the dynamics of articulatory movements. An efficient nearest-neighbor algorithm then predicts novel articulatory trajectories from these speech dynamics. The results are promising and suggest a new approach to modeling the synthetic lip motion of a given speaker driven by his or her own speech; they also offer clues toward more general, realistic cross-speaker animation.
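The abstract does not spell out the exact perceptual transform or the neighbor search, so the following is a minimal sketch under stated assumptions: a log-mel spectral envelope with appended frame-to-frame deltas standing in for the perceptual dynamic features, and plain k-nearest-neighbor regression mapping each acoustic frame onto an MPEG-4 facial animation parameter (FAP) vector. All function names, parameter values, and the choice of log-mel features here are illustrative assumptions, not the authors' method.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_fft, n_mels, sr):
    # Triangular filters spaced evenly on the mel (perceptual) scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fb

def perceptual_envelope(x, sr=16000, frame=400, hop=160, n_mels=24):
    # Per-frame log-energy mel envelope: a stand-in for the paper's
    # perceptual transformation of the speech spectral envelope.
    fb = mel_filterbank(frame, n_mels, sr)
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    feats = np.empty((n_frames, n_mels))
    for t in range(n_frames):
        seg = x[t * hop : t * hop + frame] * window
        spec = np.abs(np.fft.rfft(seg)) ** 2
        feats[t] = np.log(fb @ spec + 1e-10)
    return feats

def with_deltas(feats):
    # Append first differences so each frame also carries the local
    # articulatory dynamics, not just the static envelope.
    d = np.vstack([feats[1:] - feats[:-1], np.zeros((1, feats.shape[1]))])
    return np.hstack([feats, d])

def knn_predict(train_feats, train_faps, test_feats, k=3):
    # Nearest-neighbor regression: for each test frame, average the FAP
    # vectors of the k acoustically closest training frames. Assumes the
    # training set has more than k frames.
    out = np.empty((len(test_feats), train_faps.shape[1]))
    for t, f in enumerate(test_feats):
        dist = np.sum((train_feats - f) ** 2, axis=1)
        idx = np.argpartition(dist, k)[:k]
        out[t] = train_faps[idx].mean(axis=0)
    return out
```

In this scheme the training data are simply time-aligned pairs of acoustic frames and tracked facial parameters from one speaker's recordings; predicting a novel trajectory is a frame-by-frame lookup followed by whatever smoothing the animation pipeline applies, which is what makes a nearest-neighbor formulation attractive for speaker-specific lip motion.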