Realistic Face Animation for Audiovisual Speech Applications: A Densification Approach Driven by Sparse Stereo Meshes

Authors:
Marie-Odile Berger;Jonathan Ponroy;Brigitte Wrobel-Dautcourt
Affiliations:
LORIA/INRIA Nancy Grand Est;LORIA/INRIA Nancy Grand Est;LORIA/INRIA Nancy Grand Est
Venue:
MIRAGE '09 Proceedings of the 4th International Conference on Computer Vision/Computer Graphics Collaboration Techniques
Year:
2009

Abstract

Producing realistic facial animation is crucial for many speech applications in language learning technologies. Achieving realism requires acquiring and animating dense 3D models of the face, which are often obtained with 3D scanners. However, capturing the dynamics of speech from 3D scans is difficult, because the acquisition time generally allows only sustained sounds to be recorded. In contrast, capturing the speech dynamics on a sparse set of points is easy with a stereovision system that records a talker with markers painted on his or her face. In this paper, we propose an approach to animate a very realistic dense talking head that makes use of a reduced set of dense 3D meshes acquired for sustained sounds, together with the speech dynamics learned on a talker painted with white markers. The contributions of the paper are twofold. We first propose an appropriate principal component analysis (PCA) with missing-data techniques to compute the basic modes of the speech dynamics despite possible unobservable points in the sparse meshes obtained by the stereovision system. We then propose a method for densifying the modes, that is, a method for computing the dense modes for spatial animation from the sparse modes learned by the stereovision system. Examples demonstrate the effectiveness of the approach and the high realism obtained with our method.
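
To make the idea of PCA with missing data concrete, the following is a minimal sketch of one generic technique, iterative imputation with a truncated SVD. It is not taken from the paper and may differ from the authors' algorithm. The function name `pca_missing`, its parameters, and the convention of marking unobserved marker coordinates with NaN are all assumptions made for illustration. It also assumes every coordinate is observed in at least one frame.

```python
import numpy as np


def pca_missing(X, n_modes, n_iter=500, tol=1e-8):
    """Estimate PCA modes from data with missing entries (hypothetical helper).

    X is an (n_frames, n_coords) array in which NaN marks a coordinate
    that was not observed in that frame. Returns (mean, modes, filled),
    where modes has shape (n_modes, n_coords).
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initial guess: replace each missing entry by the mean of its column.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        centered = filled - mean
        # The leading right singular vectors are the current modes.
        _, _, Vt = np.linalg.svd(centered, full_matrices=False)
        modes = Vt[:n_modes]
        # Low-rank reconstruction from the current mean and modes.
        recon = mean + (centered @ modes.T) @ modes
        # Observed entries are kept; only missing ones are re-estimated.
        new_filled = np.where(missing, recon, X)
        converged = np.max(np.abs(new_filled - filled)) < tol
        filled = new_filled
        if converged:
            break
    return mean, modes, filled
```

In a setting like the one described above, each row of `X` would hold the stacked 3D coordinates of all markers for one frame, with NaN for markers the stereovision system could not reconstruct in that frame.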