Thai Speech-Driven Facial Animation
CULTURE-COMPUTING '11 Proceedings of the 2011 Second International Conference on Culture and Computing
Lip synchronization of animated characters is common in animation films and games, but it adds considerable workload and cost to the animation development process. In this paper, we focus on reducing that cost and workload, and we apply the technique to Thai speech. The main idea is to extract and capture visemes from a video of a person talking, together with the phonemic script of that video. The approach first separates the talking video into two parts, the speech and the frame sequence; the speech, combined with the phonemic script, is then used to extract the time-stamp of each phoneme by forced alignment. Next, we build a visyllable database by mapping the start time of each selected phoneme to an image and capturing the positions of interest in that image. Finally, we generate a talking-head animation video by synchronizing the time-stamp of each phoneme with the concatenated visemes. Experimental results are reported, indicating good accuracy of the lip movement synchronized with the speech, compared to an artist-animated talking character.
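The final synchronization step described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the phoneme-to-viseme table and the alignment tuples are invented placeholders standing in for the visyllable database and the forced-alignment output.

```python
# Sketch of viseme concatenation: map forced-alignment phoneme time-stamps
# to viseme labels, then emit one viseme per output video frame.
# VISEME_MAP and the alignment data are illustrative, not from the paper.

VISEME_MAP = {          # phoneme -> viseme class (toy subset)
    "p": "lips_closed",
    "a": "mouth_open",
    "s": "teeth",
    "sil": "neutral",
}

def viseme_track(alignment, fps=25):
    """alignment: list of (phoneme, start_sec, end_sec) from forced alignment.
    Returns one viseme label per output video frame."""
    end_time = alignment[-1][2]
    frames = []
    for i in range(int(end_time * fps)):
        t = i / fps
        label = "neutral"
        for phoneme, start, end in alignment:
            if start <= t < end:
                label = VISEME_MAP.get(phoneme, "neutral")
                break
        frames.append(label)
    return frames

# Example: the syllable "pa" preceded by silence, rendered at 10 fps
align = [("sil", 0.0, 0.1), ("p", 0.1, 0.2), ("a", 0.2, 0.5)]
track = viseme_track(align, fps=10)
print(track)
```

In a full pipeline, each label in the returned track would index an image in the visyllable database, and the selected images would be concatenated into the output animation.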