Thai Speech-Driven Facial Animation
CULTURE-COMPUTING '11 Proceedings of the 2011 Second International Conference on Culture and Computing
Lip synchronization of animated characters is common in animation films and games, but it adds considerable workload and cost to the animation development process. In this paper, we focus on reducing that cost and workload, and we apply the technique to Thai speech. The main idea is to extract and capture visemes from a video of a person talking, together with the phonemic script of that video. The approach first separates the talking video into two parts, the speech and the frame sequence; the speech, combined with the phonemic script, is then used to extract the time-stamp of each phoneme by forced alignment. Next, we build a visyllable database by mapping the start time of each selected phoneme to an image and capturing the positions of interest in that image. Finally, we generate a talking-head animation video by synchronizing the time-stamp of each phoneme with the concatenated visemes. Experimental results are reported, indicating good accuracy of the lip movement synchronized with the speech, compared to an artist-animated talking character.
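The final synchronization step described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the phoneme-to-viseme table and the alignment tuples are invented placeholders standing in for the visyllable database and the forced-alignment output.

```python
# Sketch of viseme concatenation: map forced-alignment phoneme time-stamps
# to viseme labels, then emit one viseme per output video frame.
# VISEME_MAP and the alignment data are illustrative, not from the paper.

VISEME_MAP = {          # phoneme -> viseme class (toy subset)
    "p": "lips_closed",
    "a": "mouth_open",
    "s": "teeth",
    "sil": "neutral",
}

def viseme_track(alignment, fps=25):
    """alignment: list of (phoneme, start_sec, end_sec) from forced alignment.
    Returns one viseme label per output video frame."""
    end_time = alignment[-1][2]
    frames = []
    for i in range(int(end_time * fps)):
        t = i / fps
        label = "neutral"
        for phoneme, start, end in alignment:
            if start <= t < end:
                label = VISEME_MAP.get(phoneme, "neutral")
                break
        frames.append(label)
    return frames

# Example: the syllable "pa" preceded by silence, rendered at 10 fps
align = [("sil", 0.0, 0.1), ("p", 0.1, 0.2), ("a", 0.2, 0.5)]
track = viseme_track(align, fps=10)
print(track)
```

In a full pipeline, each label in the returned track would index an image in the visyllable database, and the selected images would be concatenated into the output animation.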