WinkTalk: a demonstration of a multimodal speech synthesis platform linking facial expressions to expressive synthetic voices

  • Authors:
  • Éva Székely, Zeeshan Ahmed, João P. Cabral, Julie Carson-Berndsen

  • Affiliations:
  • University College Dublin, Dublin, Ireland (all authors)

  • Venue:
  • SLPAT '12 Proceedings of the Third Workshop on Speech and Language Processing for Assistive Technologies
  • Year:
  • 2012

Abstract

This paper describes a demonstration of the WinkTalk system, a speech synthesis platform that uses expressive synthetic voices. With the help of a web camera and facial expression analysis, the system allows users to control the expressive features of the synthetic speech for a particular utterance through their facial expressions. Based on a personalised mapping between three expressive synthetic voices and the user's facial expressions, the system selects the voice that best matches their face at the moment of sending a message. WinkTalk is an early research prototype that aims to demonstrate that facial expressions provide more intuitive control over expressive speech synthesis than manual selection of voice types, thereby contributing to an improved communication experience for users of speech generating devices.
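
The abstract does not specify how the personalised mapping is implemented, but the selection step it describes can be illustrated with a minimal sketch: record a few facial-expression feature vectors per voice during a per-user calibration phase, then at message-send time pick the voice whose calibrated profile is nearest to the current expression. The Python below is a hypothetical illustration under those assumptions; the class, the nearest-centroid rule, and the three voice labels are invented for this sketch and are not the authors' implementation.

```python
import numpy as np

# Hypothetical sketch of WinkTalk's voice-selection step. The paper does not
# publish its mapping algorithm; the voice labels, features, and the
# nearest-centroid rule below are illustrative assumptions.

VOICES = ("cheerful", "neutral", "tender")  # three expressive voices (labels assumed)


class PersonalisedVoiceMapper:
    """Maps a user's facial-expression feature vector to one of three voices."""

    def __init__(self):
        # One list of calibration feature vectors per voice, gathered while
        # the user poses the expression they want linked to that voice.
        self.samples = {voice: [] for voice in VOICES}
        self.centroids = {}

    def calibrate(self, voice, features):
        """Record one calibration frame (e.g. expression intensities) for a voice."""
        self.samples[voice].append(np.asarray(features, dtype=float))

    def finalise(self):
        """Compute one centroid per voice from its calibration frames."""
        self.centroids = {
            voice: np.mean(frames, axis=0)
            for voice, frames in self.samples.items()
            if frames
        }

    def select_voice(self, features):
        """Pick the voice whose calibrated centroid is nearest the current frame."""
        features = np.asarray(features, dtype=float)
        return min(
            self.centroids,
            key=lambda voice: np.linalg.norm(features - self.centroids[voice]),
        )


# Usage: calibrate per user, then classify the expression at message-send time.
mapper = PersonalisedVoiceMapper()
mapper.calibrate("cheerful", [0.9, 0.1, 0.2])  # smile-dominant frame
mapper.calibrate("neutral", [0.1, 0.1, 0.1])
mapper.calibrate("tender", [0.2, 0.1, 0.8])
mapper.finalise()
print(mapper.select_voice([0.8, 0.2, 0.3]))  # -> "cheerful"
```

A per-user calibration of this kind is one plausible way to realise the "personalised mapping" the abstract refers to, since the same facial expression can carry different expressive intent for different users.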