We present a novel approach to speech-driven facial animation using a non-parametric switching state space model based on Gaussian processes. The model is an extension of the shared Gaussian process dynamical model, augmented with switching states. Audio and visual data from a talking head corpus are jointly modelled using the proposed method. The switching states are found using variable length Markov models trained on labelled phonetic data. We also propose a synthesis technique that takes into account both previous and future phonetic context, thus accounting for coarticulatory effects in speech.
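As a rough illustration of the variable length Markov model idea mentioned above, the sketch below predicts the next phonetic label from the longest context that was seen often enough in training. It is not the authors' implementation: the function names `train_vlmm` and `predict`, the toy phone sequence, and the simple count threshold `min_count` (used here in place of the statistical pruning criteria found in the literature) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_vlmm(sequence, max_order=3, min_count=2):
    """Count next-symbol frequencies for every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for i, symbol in enumerate(sequence):
        for order in range(max_order + 1):
            if i - order < 0:
                break
            context = tuple(sequence[i - order:i])
            counts[context][symbol] += 1
    # Assumption: keep a context only if it was observed at least min_count
    # times; the empty context is always kept as a fallback.
    return {ctx: c for ctx, c in counts.items()
            if not ctx or sum(c.values()) >= min_count}


def predict(model, history, max_order=3):
    """Return next-symbol probabilities from the longest stored suffix of history."""
    for order in range(min(max_order, len(history)), -1, -1):
        context = tuple(history[len(history) - order:])
        if context in model:
            c = model[context]
            total = sum(c.values())
            return {s: n / total for s, n in c.items()}
    return {}


# Hypothetical toy phone-label sequence, for illustration only.
phones = ["s", "ih", "l", "s", "ih", "t", "s", "ih", "l"]
model = train_vlmm(phones, max_order=2, min_count=2)
print(predict(model, ["s", "ih"], max_order=2))  # uses the length-2 context
print(predict(model, ["t", "s"], max_order=2))   # backs off to the length-1 context
```

The memory length varies per context: frequently seen contexts are kept at full length, while rare ones fall back to a shorter suffix, which is what lets such a model capture long-range phonetic context only where the data supports it.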