Visual recognition of speech consonants using facial movement features
Integrated Computer-Aided Engineering - Informatics in Control, Automation and Robotics
This paper describes a new technique for recognizing speech from visual information alone. The video of the speaker's mouth is represented as a single grayscale image known as a motion history image (MHI). The MHI is generated by applying accumulative image differencing to the frames of the video, implicitly encoding the temporal information of the mouth movement. Each MHI is decomposed into wavelet sub-images using the Discrete Stationary Wavelet Transform (SWT), and three moment-based feature sets (geometric moments, Zernike moments and Hu moments) are extracted from the SWT approximation sub-images. A multilayer perceptron (MLP) artificial neural network (ANN) trained with the back-propagation learning algorithm is used to classify the moment features. The paper evaluates and compares the image-representation ability of the different moments. Initial experiments show that this method classifies English consonants with an error rate below 5%.
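The first two stages of the pipeline, accumulative image differencing into an MHI and raw geometric moment extraction, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the difference threshold, and the linear decay schedule are all assumptions, and the paper additionally applies the SWT and Zernike/Hu moments, which are omitted here.

```python
import numpy as np

def motion_history_image(frames, threshold=10, duration=255):
    """Accumulate thresholded frame differences into one grayscale MHI.

    Pixels that moved recently are bright; older motion fades linearly,
    so a single image implicitly encodes the temporal order of movement.
    (Threshold and decay schedule are illustrative assumptions.)
    """
    mhi = np.zeros(frames[0].shape, dtype=np.float64)
    decay = duration / max(len(frames) - 1, 1)
    for prev, curr in zip(frames, frames[1:]):
        moving = np.abs(curr.astype(np.int32) - prev.astype(np.int32)) > threshold
        mhi = np.maximum(mhi - decay, 0.0)  # fade previously stamped motion
        mhi[moving] = duration              # stamp new motion at full intensity
    return mhi.astype(np.uint8)

def geometric_moment(img, p, q):
    """Raw geometric moment m_pq = sum over pixels of x^p * y^q * I(x, y)."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w]
    return float(np.sum((x ** p) * (y ** q) * img.astype(np.float64)))

# Synthetic example: a 2x2 bright block drifting rightward across 4 frames.
frames = []
for t in range(4):
    f = np.zeros((8, 8), dtype=np.uint8)
    f[3:5, t:t + 2] = 200
    frames.append(f)

mhi = motion_history_image(frames)
m00 = geometric_moment(mhi, 0, 0)
centroid_x = geometric_moment(mhi, 1, 0) / m00  # skewed toward recent motion
```

Because recent motion is stamped brighter, low-order moments of the MHI (here the centroid) shift toward where the mouth moved last, which is why such moments can serve as compact spatio-temporal features.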