A new manifold representation for visual speech recognition

Authors:
Dahai Yu;Ovidiu Ghita;Alistair Sutherland;Paul F. Whelan
Affiliations:
School of Computing & Electronic Engineering, Vision Systems Group, Dublin City University, Dublin 9, Ireland;School of Computing & Electronic Engineering, Vision Systems Group, Dublin City University, Dublin 9, Ireland;School of Computing & Electronic Engineering, Vision Systems Group, Dublin City University, Dublin 9, Ireland;School of Computing & Electronic Engineering, Vision Systems Group, Dublin City University, Dublin 9, Ireland
Venue:
CAIP'07 Proceedings of the 12th international conference on Computer analysis of images and patterns
Year:
2007

Abstract

In this paper, we propose a new manifold representation for visual speech recognition. The real-time input video data is compressed using Principal Component Analysis (PCA), and the low-dimensional points calculated for each frame define the manifold. Since the number of frames that form the video sequence depends on the word complexity, the manifolds must be re-sampled into a fixed number of keypoints before they can be used as input for classification. Two classification schemes are evaluated: the k-Nearest Neighbour (kNN) algorithm used in conjunction with two-stage PCA, and a Hidden Markov Model (HMM) classifier. Results for a group of English words indicate that the proposed approach classifies them accurately.
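
The pipeline described in the abstract (per-frame PCA projection followed by re-sampling to a fixed number of keypoints) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the synthetic input, the number of components, and the use of linear interpolation over a normalised frame index are all assumptions made for illustration; the abstract does not specify the re-sampling scheme.

```python
import numpy as np


def fit_pca(frames, n_components):
    """Fit PCA on flattened frames of shape (n_frames, n_pixels).

    Returns the mean frame and the top principal axes (n_components, n_pixels).
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are orthonormal principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(frames, mean, axes):
    """Map each frame to a low-dimensional point; the sequence of points
    traces the manifold (trajectory) of one spoken word."""
    return (frames - mean) @ axes.T


def resample_manifold(points, n_keypoints):
    """Re-sample a trajectory of shape (n_frames, d) to (n_keypoints, d).

    Assumption: linear interpolation at evenly spaced positions along the
    normalised frame index. Other schemes (e.g. splines) are equally plausible.
    """
    n_frames, dim = points.shape
    src = np.linspace(0.0, 1.0, n_frames)
    dst = np.linspace(0.0, 1.0, n_keypoints)
    columns = [np.interp(dst, src, points[:, k]) for k in range(dim)]
    return np.stack(columns, axis=1)


# Hypothetical input: 17 frames of an 8x8 mouth region, flattened to 64 values.
rng = np.random.default_rng(0)
video = rng.normal(size=(17, 64))

mean, axes = fit_pca(video, n_components=3)
manifold = project(video, mean, axes)          # shape (17, 3)
keypoints = resample_manifold(manifold, 10)    # shape (10, 3)

# A fixed-length feature vector that a classifier such as kNN could consume.
feature_vector = keypoints.ravel()             # shape (30,)
```

Because every word is reduced to the same number of keypoints, sequences of different lengths yield feature vectors of identical size, which is what makes a distance-based classifier applicable.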