A Temporal Network of Support Vector Machine Classifiers for the Recognition of Visual Speech

Authors:
Mihaela Gordan;Constantine Kotropoulos;Ioannis Pitas
Venue:
SETN '02 Proceedings of the Second Hellenic Conference on AI: Methods and Applications of Artificial Intelligence
Year:
2002

Abstract

Speech recognition based on visual information is an emerging research field. We propose a new system for the recognition of visual speech based on support vector machines, which have proved to be powerful classifiers in other visual tasks. We use support vector machines to recognize the mouth shapes corresponding to the different phones produced. To model the temporal character of speech, we employ Viterbi decoding in a network of support vector machines. The recognition rate obtained is higher than those reported earlier when the same features were used. The proposed solution can be generalized easily to large-vocabulary recognition tasks because it uses viseme models rather than entire-word models.
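
The idea of Viterbi decoding over per-frame classifier outputs can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each viseme classifier's output for each video frame has already been mapped to a probability-like score, and that each word is modeled as a left-to-right sequence of visemes in which a path may either stay in the current viseme or advance to the next one. The names `viterbi_score` and `recognize`, the viseme labels, and the scores are all hypothetical.

```python
import math


def viterbi_score(frame_logp, visemes):
    """Best left-to-right alignment of video frames to one word model.

    frame_logp: one dict per frame, mapping viseme label -> log score
                (assumed here to come from per-viseme classifiers).
    visemes:    the word model, an ordered list of viseme labels.
    Returns (log score of the best path, state index chosen per frame).
    """
    n_frames, n_states = len(frame_logp), len(visemes)
    neg_inf = -math.inf
    # delta[t][s]: best score of any path that is in state s at frame t
    delta = [[neg_inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    # Every path must start in the first viseme of the word.
    delta[0][0] = frame_logp[0][visemes[0]]
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = delta[t - 1][s]
            advance = delta[t - 1][s - 1] if s > 0 else neg_inf
            if advance > stay:
                best, back[t][s] = advance, s - 1
            else:
                best, back[t][s] = stay, s
            delta[t][s] = best + frame_logp[t][visemes[s]]
    # Every path must end in the last viseme; trace it backwards.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return delta[-1][-1], path


def recognize(frame_logp, word_models):
    """Pick the word whose viseme sequence gives the highest Viterbi score."""
    return max(word_models,
               key=lambda w: viterbi_score(frame_logp, word_models[w])[0])


# Hypothetical scores for four frames and two viseme classes "a" and "b".
raw = [{"a": 0.9, "b": 0.1}, {"a": 0.8, "b": 0.2},
       {"a": 0.3, "b": 0.7}, {"a": 0.1, "b": 0.9}]
frames = [{v: math.log(p) for v, p in f.items()} for f in raw]
word_models = {"ab": ["a", "b"], "ba": ["b", "a"]}
```

With these made-up scores, `recognize(frames, word_models)` selects the word model `"ab"`, aligning the first two frames to viseme `"a"` and the last two to viseme `"b"`. Because words are built from shared viseme units, adding a new word to this sketch only requires a new entry in `word_models`, which reflects the generalization argument made in the abstract.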