A Novel Visual Speech Representation and HMM Classification for Visual Speech Recognition
PSIVT '09 Proceedings of the 3rd Pacific Rim Symposium on Advances in Image and Video Technology
Clustering Persian viseme using phoneme subspace for developing visual speech application
Multimedia Tools and Applications
Automatic lipreading is automatic speech recognition that uses only visual information. The relevant data in a video signal are isolated, and features are extracted from them. From a sequence of feature vectors, in which each vector represents one video image, a sequence of higher-level semantic elements is formed. These semantic elements are "visemes", the visual equivalent of "phonemes". The developed prototype uses a Time Delayed Neural Network to classify the visemes.
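The step from per-frame feature vectors to a viseme sequence can be illustrated with a minimal sketch. It assumes a single time-delay layer that scores each window of consecutive frames against a set of viseme classes, then merges runs of identical labels into one viseme. The function names, the context width, the weights, and the labels are all hypothetical; the abstract does not describe the prototype's actual network architecture or feature set.

```python
import numpy as np

def time_delay_layer(frames, weights, bias):
    """Score every window of consecutive frames against each viseme class.

    frames:  (T, D) array, one D-dimensional feature vector per video image.
    weights: (context * D, n_classes) array; `context` is the window width.
    bias:    (n_classes,) array.
    Returns a (T - context + 1, n_classes) array of class scores.
    """
    context = weights.shape[0] // frames.shape[1]
    n_out = frames.shape[0] - context + 1
    # Flatten each window of `context` frames into one input vector.
    windows = np.stack([frames[t:t + context].ravel() for t in range(n_out)])
    return windows @ weights + bias

def classify_visemes(frames, weights, bias, labels):
    """Pick the best-scoring viseme per window and merge repeated labels."""
    best = time_delay_layer(frames, weights, bias).argmax(axis=1)
    sequence = []
    for idx in best:
        if not sequence or sequence[-1] != labels[idx]:
            sequence.append(labels[idx])
    return sequence

# Toy input (assumed values): six frames with one feature each,
# a context of two frames, and two placeholder viseme classes.
toy_frames = np.array([[1.0], [1.0], [1.0], [-1.0], [-1.0], [-1.0]])
toy_weights = np.array([[1.0, -1.0], [1.0, -1.0]])
toy_bias = np.zeros(2)
print(classify_visemes(toy_frames, toy_weights, toy_bias, ["A", "B"]))
```

In a trained network the weights would be learned from labelled video, and several such layers with nonlinearities would normally be stacked; the sketch only shows how a sliding temporal window turns a frame sequence into a shorter sequence of semantic units.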