International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction
In this paper, a system that transforms speech waveforms into animated faces is proposed. The system relies on continuous state-space models to perform the mapping, which makes it possible to produce video with no sudden jumps and allows continuous control of the parameters in 'face space'. The performance of the system is critically dependent on the number of hidden variables: with too few variables the model cannot represent the data, and with too many, overfitting is observed. Simulations are performed on recordings of 3-5 second video sequences with sentences from the TIMIT database. From a subjective point of view, the model is able to construct an image sequence from an unknown noisy speech sequence even though the number of training examples is limited.
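
To illustrate why a continuous state-space mapping yields video without sudden jumps, the following is a minimal sketch, not the model used in the paper. It assumes a linear-Gaussian model with a single scalar hidden state that drives both a speech feature and a face parameter, and it infers the state from speech with a standard Kalman filter. The function name `kalman_speech_to_face` and the parameters `a`, `b`, `c`, `q`, `r` are hypothetical, and all numeric values are illustrative only.

```python
def kalman_speech_to_face(speech, a, b, c, q, r, x0=0.0, p0=1.0):
    """Filter a hidden state from speech features, then emit face parameters.

    Assumed model (illustrative, not taken from the paper):
        x_t = a * x_{t-1} + w_t,  w_t ~ N(0, q)   hidden state
        s_t = b * x_t + v_t,      v_t ~ N(0, r)   observed speech feature
        f_t = c * x_t                             face parameter (mean)
    """
    x, p = x0, p0  # current state estimate and its variance
    face = []
    for s in speech:
        # Predict: propagate the state estimate and its variance one frame.
        x_pred = a * x
        p_pred = a * p * a + q
        # Update: correct the prediction with the observed speech feature.
        k = p_pred * b / (b * p_pred * b + r)  # Kalman gain
        x = x_pred + k * (s - b * x_pred)
        p = (1.0 - k * b) * p_pred
        # Map the filtered state into 'face space'.
        face.append(c * x)
    return face


# Hypothetical noisy speech feature track; the 3.0 frame is an outlier.
noisy_speech = [0.0, 1.2, 0.8, 1.1, 0.9, 3.0, 1.0]
face_track = kalman_speech_to_face(noisy_speech, a=0.95, b=1.0, c=2.0, q=0.01, r=1.0)
print([round(v, 3) for v in face_track])
```

Because the face parameter is read from a filtered state rather than directly from each speech frame, a single noisy frame moves the output only by a fraction of its raw deviation. In this sketch, a small process variance `q` relative to the observation variance `r` gives a smoother face track, at the cost of a slower response to genuine changes in the speech.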