Classifying Visemes for Automatic Lipreading

Authors:
Michiel Visser;Mannes Poel;Anton Nijholt
Venue:
TSD '99 Proceedings of the Second International Workshop on Text, Speech and Dialogue
Year:
1999

Citing 0
Cited 3

A Novel Visual Speech Representation and HMM Classification for Visual Speech Recognition
PSIVT '09 Proceedings of the 3rd Pacific Rim Symposium on Advances in Image and Video Technology

Comprehensive many-to-many phoneme-to-viseme mapping and its application for concatenative visual speech synthesis
Speech Communication

Clustering Persian viseme using phoneme subspace for developing visual speech application
Multimedia Tools and Applications

Abstract

Automatic lipreading is automatic speech recognition that uses only visual information. The relevant data in a video signal is isolated, and features are extracted from it. From a sequence of feature vectors, where every vector represents one video image, a sequence of higher-level semantic elements is formed. These semantic elements are "visemes", the visual equivalent of "phonemes". The developed prototype uses a Time Delayed Neural Network to classify the visemes.
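
The classification step described in the abstract can be illustrated with a minimal sketch of a time-delay network: each layer applies the same weights to every window of consecutive frames, so a viseme is scored from a short span of feature vectors rather than from a single image. The abstract does not give the prototype's architecture, so everything below is an assumption made for illustration: the layer sizes, the window lengths, the tanh activation, and the function names are hypothetical, and the weights are random rather than trained.

```python
import math
import random


def time_delay_layer(frames, weights, biases, activation):
    """Slide one set of shared weights over every window of consecutive frames.

    frames:  list of T feature vectors, each of length D
    weights: list of K units; each unit holds `delay` weight vectors of length D
    biases:  list of K floats
    Returns T - delay + 1 output vectors, each of length K.
    """
    delay = len(weights[0])
    outputs = []
    for t in range(len(frames) - delay + 1):
        window = frames[t:t + delay]
        out = []
        for unit_weights, bias in zip(weights, biases):
            total = bias
            for w_vec, frame in zip(unit_weights, window):
                total += sum(w * x for w, x in zip(w_vec, frame))
            out.append(activation(total))
        outputs.append(out)
    return outputs


def random_weights(rng, units, delay, dim):
    """Untrained placeholder weights (illustration only)."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(dim)]
             for _ in range(delay)]
            for _ in range(units)]


def classify_visemes(frames, hidden_w, hidden_b, out_w, out_b):
    """Map a sequence of feature vectors to a sequence of viseme indices."""
    hidden = time_delay_layer(frames, hidden_w, hidden_b, math.tanh)
    scores = time_delay_layer(hidden, out_w, out_b, lambda s: s)
    # Pick the highest-scoring viseme class for each output time step.
    return [max(range(len(s)), key=s.__getitem__) for s in scores]


# Hypothetical sizes: 4 features per video image, 6 hidden units, 5 viseme classes.
NUM_FEATURES, NUM_HIDDEN, NUM_VISEMES = 4, 6, 5
rng = random.Random(0)
frames = [[rng.random() for _ in range(NUM_FEATURES)] for _ in range(10)]
hidden_w = random_weights(rng, NUM_HIDDEN, 3, NUM_FEATURES)   # window of 3 frames
hidden_b = [0.0] * NUM_HIDDEN
out_w = random_weights(rng, NUM_VISEMES, 2, NUM_HIDDEN)       # window of 2 hidden steps
out_b = [0.0] * NUM_VISEMES

labels = classify_visemes(frames, hidden_w, hidden_b, out_w, out_b)
print(labels)  # one viseme index per output time step
```

With 10 input frames, a window of 3 in the hidden layer, and a window of 2 in the output layer, the sketch yields 7 viseme labels. Because the weights are shared across time, the same viseme can be detected wherever it occurs in the sequence.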