Realistic visual speech synthesis based on hybrid concatenation method
IEEE Transactions on Audio, Speech, and Language Processing - Special issue on multimodal processing in speech-based interactions
We present a novel speech animation approach using coupled hidden Markov models (CHMMs). Unlike conventional HMMs, which model audio-visual speech with a single state chain and thus impose tight inter-modal synchronization, CHMMs capture the asynchrony, the differing discriminative abilities, and the temporal coupling between the audio and visual streams, all of which are important for natural-looking animation. Based on the audio-visual CHMMs, visual animation parameters are predicted from audio through an EM-based audio-to-visual conversion algorithm. Experiments on the JEWEL AV database show that, compared with conventional HMMs, the CHMMs output visual parameters that are much closer to the actual ones. Explicit modelling of audio-visual speech is therefore promising for speech animation.
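The abstract's audio-to-visual conversion idea can be illustrated with a much simpler stand-in. The sketch below is not the paper's CHMM/EM algorithm; it assumes a per-state joint Gaussian over concatenated (audio, visual) features and predicts the visual part as a posterior-weighted conditional mean, E[v|a] = Σ_s p(s|a)(μ_v,s + C_va,s C_aa,s⁻¹ (a − μ_a,s)). All function names, dimensions, and model parameters here are illustrative assumptions.

```python
import numpy as np

def audio_to_visual(a, weights, means, covs, d_a):
    """Hypothetical audio-to-visual regression (NOT the paper's CHMM method).

    a       : (d_a,) audio feature frame
    weights : (S,) state priors
    means   : (S, d_a + d_v) joint (audio, visual) means per state
    covs    : (S, d_a + d_v, d_a + d_v) joint covariances per state
    d_a     : audio feature dimension
    """
    S = len(weights)

    # State posteriors p(s | a) from the audio marginal of each joint Gaussian.
    log_post = np.empty(S)
    for s in range(S):
        mu_a = means[s, :d_a]
        C_aa = covs[s, :d_a, :d_a]
        diff = a - mu_a
        _, logdet = np.linalg.slogdet(C_aa)
        log_post[s] = (np.log(weights[s])
                       - 0.5 * (logdet + diff @ np.linalg.solve(C_aa, diff)))
    log_post -= log_post.max()          # stabilize before exponentiating
    post = np.exp(log_post)
    post /= post.sum()

    # Posterior-weighted Gaussian conditional means give the visual estimate.
    v_hat = np.zeros(means.shape[1] - d_a)
    for s in range(S):
        mu_a, mu_v = means[s, :d_a], means[s, d_a:]
        C_aa = covs[s, :d_a, :d_a]
        C_va = covs[s, d_a:, :d_a]
        v_hat += post[s] * (mu_v + C_va @ np.linalg.solve(C_aa, a - mu_a))
    return v_hat
```

In the paper's setting the state posteriors would instead come from CHMM inference over whole audio sequences, which is what lets the model exploit the asynchrony between the two streams; this frame-wise version only shows the regression step.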