Visual speech synthesis by modelling coarticulation dynamics using a non-parametric switching state-space model

  • Authors:
  • Salil Deena; Shaobo Hou; Aphrodite Galata

  • Affiliations:
  • University of Manchester, UK (all authors)

  • Venue:
  • International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction
  • Year:
  • 2010

Abstract

We present a novel approach to speech-driven facial animation using a non-parametric switching state-space model based on Gaussian processes. The model is an extension of the shared Gaussian process dynamical model, augmented with switching states. Audio and visual data from a talking head corpus are jointly modelled using the proposed method. The switching states are found using variable length Markov models trained on labelled phonetic data. We also propose a synthesis technique that takes into account both previous and future phonetic context, thus accounting for coarticulatory effects in speech.
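As a rough illustration of the switching-dynamics idea, the sketch below fits a separate kernel-regression (GP predictive mean) transition model per discrete switching state and uses it to roll a latent state forward. It is a simplified stand-in rather than the paper's method: the class and function names, the RBF kernel, and the synthetic data are assumptions for illustration, whereas the actual model is a shared Gaussian process dynamical model whose latent space is learnt jointly from audio and visual data, with switching states supplied by variable length Markov models over phoneme labels.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel between the rows of A (n, d) and B (m, d).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

class SwitchingDynamics:
    """One GP-regression transition model x_t -> x_{t+1} per discrete switching state
    (hypothetical names; a stand-in for the switching state-space model in the paper)."""

    def __init__(self, n_states, noise=1e-2):
        self.n_states = n_states
        self.noise = noise
        self.models = {}          # state index -> (training inputs, dual weights)

    def fit(self, X, switches):
        # X: (T, d) latent trajectory; switches: (T-1,) state label of each transition.
        for s in range(self.n_states):
            idx = np.where(switches == s)[0]
            if idx.size == 0:
                continue
            X_in, X_out = X[idx], X[idx + 1]
            K = rbf_kernel(X_in, X_in) + self.noise * np.eye(idx.size)
            alpha = np.linalg.solve(K, X_out)   # dual weights for the GP mean
            self.models[s] = (X_in, alpha)

    def step(self, x, s):
        # Predictive mean of the next latent point under switching state s.
        X_in, alpha = self.models[s]
        k = rbf_kernel(x[None, :], X_in)
        return (k @ alpha)[0]

# Toy usage: a 2-D latent trajectory with two phoneme-like regimes.
rng = np.random.default_rng(0)
X = np.cumsum(0.1 * rng.standard_normal((200, 2)), axis=0)
switches = (np.arange(199) // 50) % 2           # alternate regime every 50 frames
dyn = SwitchingDynamics(n_states=2)
dyn.fit(X, switches)
x_next = dyn.step(X[-1], s=1)
```

The sketch only shows how per-state dynamics can be indexed by a discrete switch; in the full model, the joint audio-visual latent space and the use of both previous and future phonetic context during synthesis are what account for coarticulation.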