Intelligent content production for a virtual speaker

  • Authors:
  • Karlo Smid, Igor S. Pandzic, Viktorija Radman

  • Affiliations:
  • Ericsson Nikola Tesla, Zagreb; Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb; Ericsson Nikola Tesla, Zagreb

  • Venue:
  • IMTCI'04 Proceedings of the Second international conference on Intelligent Media Technology for Communicative Intelligence
  • Year:
  • 2004

Abstract

We present a graphically embodied animated agent (a virtual speaker) capable of reading plain English text and rendering it in the form of speech accompanied by appropriate facial gestures. Our system uses lexical analysis of the English text and statistical models of facial gestures to automatically generate gestures related to the spoken text. It is intended for the automatic creation of realistically animated virtual speakers, such as newscasters and storytellers, and incorporates the characteristics of such speakers captured from training video clips. Our system is based on a visual text-to-speech system that generates lip movements synchronised with the generated speech. This is extended to include eye blinks, head and eyebrow motion, and a simple gaze-following behaviour. The result is full facial animation produced automatically from plain English text.
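The pipeline described above (lexical analysis of plain text, followed by statistical sampling of facial gestures per word) can be illustrated with a minimal sketch. All names, gesture labels, and probabilities below are hypothetical placeholders: the paper's actual models are trained from video clips of real speakers, whereas this sketch uses a crude function-word/content-word split and hand-picked weights purely to show the shape of the idea.

```python
import random

# Hypothetical gesture inventory with per-word-class probabilities.
# In the described system these distributions would come from
# statistical models trained on recordings of newscasters.
GESTURE_PROBS = {
    "content":  {"eyebrow_raise": 0.30, "head_nod": 0.40, "none": 0.30},
    "function": {"eyebrow_raise": 0.05, "head_nod": 0.10, "none": 0.85},
}

# Very small stop-word list standing in for real lexical analysis.
FUNCTION_WORDS = {"a", "an", "the", "of", "and", "in", "to", "is", "it", "from"}

def word_class(word: str) -> str:
    """Crude lexical analysis: classify a word as function or content."""
    return "function" if word.lower().strip(".,") in FUNCTION_WORDS else "content"

def sample_gesture(word: str, rng: random.Random) -> str:
    """Sample one facial gesture for a word from its class distribution."""
    probs = GESTURE_PROBS[word_class(word)]
    gestures = list(probs.keys())
    weights = list(probs.values())
    return rng.choices(gestures, weights=weights)[0]

def annotate(text: str, seed: int = 0) -> list:
    """Return (word, gesture) pairs for each word of the input text."""
    rng = random.Random(seed)
    return [(w, sample_gesture(w, rng)) for w in text.split()]

if __name__ == "__main__":
    for word, gesture in annotate("The anchor reads the news from a script"):
        print(f"{word:>8} -> {gesture}")
```

A full system would replace the stop-word heuristic with proper lexical analysis, time-align the sampled gestures with the phoneme stream of the visual text-to-speech engine, and layer on continuous behaviours such as eye blinks and gaze following.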