Robust AAM-based audio-visual speech recognition against face direction changes

  • Authors:
  • Yuto Komai;Nan Yang;Tetsuya Takiguchi;Yasuo Ariki

  • Affiliations:
  • Kobe University, Kobe, Japan (all authors)

  • Venue:
  • Proceedings of the 20th ACM international conference on Multimedia
  • Year:
  • 2012

Abstract

As one of the techniques for robust speech recognition in noisy environments, audio-visual speech recognition (AVSR), which uses dynamic lip image information together with audio information, has been attracting attention, and research in this area has advanced in recent years. However, in visual speech recognition (VSR), when a face turns sideways, the shape of the lips as seen from the camera changes and recognition accuracy degrades significantly. For this reason, many conventional VSR methods are limited to situations in which the face is viewed from the front. This paper proposes a VSR method that converts faces viewed from various directions into frontal-view faces using Active Appearance Models (AAM). In experiments, recognition accuracy improved significantly even when the face direction changed by about 30 degrees from the frontal view.
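
The core idea described in the abstract (an AAM models a face shape as a mean plus a linear combination of PCA basis vectors, and parameters extracted from a non-frontal view can be converted into frontal-view parameters) can be sketched roughly as below. This is an illustrative outline only, not the paper's implementation: the function names, the number of components, and in particular the linear pose-to-frontal parameter mapping are assumptions introduced for the example.

```python
import numpy as np

# Minimal sketch, not the authors' method: a PCA shape model stands in for
# the AAM, and a least-squares regression (a hypothetical stand-in for the
# paper's conversion step) maps rotated-view parameters to frontal-view ones.

def build_shape_model(shapes, n_components=10):
    """PCA shape model from aligned landmark vectors of shape (N, 2L)."""
    X = np.asarray(shapes, dtype=float)
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]          # mean shape, shape basis

def to_params(landmarks, mean, basis):
    """Project observed landmarks onto the shape basis."""
    return basis @ (np.asarray(landmarks, dtype=float) - mean)

def learn_pose_mapping(side_params, frontal_params):
    """Least-squares mapping from rotated-view to frontal-view parameters,
    learned from paired training data (assumed available)."""
    A, *_ = np.linalg.lstsq(side_params, frontal_params, rcond=None)
    return A

def frontalize(landmarks, side_model, frontal_model, A):
    """Estimate a frontal-view lip shape from landmarks seen at an angle."""
    p = to_params(landmarks, *side_model)   # parameters in the side-view model
    q = p @ A                               # mapped frontal-view parameters
    mean_f, basis_f = frontal_model
    return mean_f + q @ basis_f             # reconstructed frontal shape
```

The frontalized shape (and, in a full AAM, the corresponding appearance parameters) would then feed the visual front end of the AVSR system, so the recognizer sees pose-normalized lip features rather than raw side-view images.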