Speech driven facial animation

  • Authors:
  • P. Kakumanu, R. Gutierrez-Osuna, A. Esposito, R. Bryll, A. Goshtasby, O. N. Garcia

  • Affiliations:
  • Wright State University, Dayton, OH (all authors)

  • Venue:
  • Proceedings of the 2001 Workshop on Perceptive User Interfaces
  • Year:
  • 2001

Abstract

The results reported in this article are an integral part of a larger project aimed at achieving perceptually realistic, speech-driven animations of three-dimensional human faces, including a speaker's individual nuances. We describe the audiovisual system developed to learn the spatio-temporal relationship between speech acoustics and facial motion, covering video and speech processing, pattern analysis, and MPEG-4-compliant facial animation for a given speaker. In particular, we propose a perceptual transformation of the speech spectral envelope, which is shown to capture the dynamics of articulatory movements. An efficient nearest-neighbor algorithm is then used to predict novel articulatory trajectories from these speech dynamics. The results are very promising and suggest a new approach to modeling the synthetic lip motion of a given speaker from his or her speech. They would also provide clues toward more general, realistic cross-speaker animation.
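
The abstract outlines a two-stage pipeline: a perceptually motivated transform of the speech spectral envelope, followed by nearest-neighbor lookup to map acoustic frames onto facial animation parameters. The sketch below illustrates that general shape of pipeline only; the mel-scale filterbank, frame sizes, first-difference "dynamics" features, and the 68-dimensional MPEG-4 FAP target are assumptions standing in for details the abstract does not specify, not the authors' actual implementation.

```python
# Minimal sketch (assumptions noted above): perceptually scaled spectral
# features per speech frame, then nearest-neighbor prediction of facial
# animation parameter (FAP) vectors from those features.
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale (assumed transform)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        if c > l:
            fb[i - 1, l:c] = (np.arange(l, c) - l) / (c - l)  # rising slope
        if r > c:
            fb[i - 1, c:r] = (r - np.arange(c, r)) / (r - c)  # falling slope
    return fb

def perceptual_features(signal, sr, n_fft=512, hop=160, n_filters=24):
    """Log mel-band energies per frame, plus first differences for dynamics."""
    fb = mel_filterbank(n_filters, n_fft, sr)
    window = np.hanning(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft, hop):
        spec = np.abs(np.fft.rfft(signal[start:start + n_fft] * window))
        frames.append(np.log(fb @ (spec ** 2) + 1e-10))
    feats = np.asarray(frames)
    deltas = np.diff(feats, axis=0, prepend=feats[:1])  # crude dynamics term
    return np.hstack([feats, deltas])

def predict_faps(train_feats, train_faps, query_feats):
    """Copy the FAP vector of the acoustically closest training frame."""
    out = np.empty((len(query_feats), train_faps.shape[1]))
    for i, q in enumerate(query_feats):
        d = np.sum((train_feats - q) ** 2, axis=1)  # squared Euclidean distance
        out[i] = train_faps[np.argmin(d)]
    return out

if __name__ == "__main__":
    sr = 16000
    rng = np.random.default_rng(0)
    train_audio = rng.standard_normal(sr * 2)  # stand-in for recorded speech
    query_audio = rng.standard_normal(sr)      # stand-in for novel speech
    train_feats = perceptual_features(train_audio, sr)
    query_feats = perceptual_features(query_audio, sr)
    train_faps = rng.standard_normal((len(train_feats), 68))  # MPEG-4 defines 68 FAPs
    faps = predict_faps(train_feats, train_faps, query_feats)
    print(faps.shape)  # one predicted FAP vector per query frame
```

The per-frame argmin lookup is the simplest possible nearest-neighbor predictor; for longer training recordings, a spatial index such as a k-d tree and temporal smoothing across consecutive frames would be natural refinements.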