About neural-network algorithms application in viseme classification problem with face video in audiovisual speech recognition systems

Authors:
A. V. Savchenko;Ya. I. Khokhlova
Affiliations:
Department of Business Informatics and Applied Mathematics, National Research University Higher School of Economics, Nizhni Novgorod, Russia;Department of Business Informatics and Applied Mathematics, National Research University Higher School of Economics, Nizhni Novgorod, Russia
Venue:
Optical Memory and Neural Networks
Year:
2014

Abstract

The paper considers phoneme recognition from a speaker's facial expressions in voice-activated control systems. We developed a neural-network recognition algorithm based on the phonetic words decoding method and on the requirement that voice commands be pronounced as isolated syllables. We present experimental results on the classification of visemes (the facial and lip positions corresponding to a particular phoneme) for Russian vowels. We show how classification accuracy depends on the classifier (multilayer feed-forward network, support vector machine, k-nearest neighbor method), the image features (histogram of oriented gradients, eigenvectors, SURF local descriptors), and the type of camera (built-in or Kinect). The best speaker-dependent recognition accuracy, 85% for a built-in camera and 96% for Kinect depth maps, is obtained with the histogram of oriented gradients and the support vector machine.
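
The feature-then-classify pipeline named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a simplified, single-cell orientation histogram in place of a full histogram of oriented gradients, uses a 1-nearest-neighbor classifier (one of the classifiers the abstract lists, not the best-performing support vector machine), and substitutes synthetic stripe images for face or depth-map crops. All function names and labels are hypothetical.

```python
import math


def orientation_histogram(image, bins=9):
    """Simplified, single-cell HOG: magnitude-weighted histogram of
    unsigned gradient orientations over the whole image, L2-normalized."""
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # central differences
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # fold into [0, pi)
            hist[int(angle / math.pi * bins) % bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]


def knn_predict(train, query, k=1):
    """Majority vote among the k training features closest to `query`.
    `train` is a list of (feature_vector, label) pairs."""
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def stripes(size, vertical, shift=0):
    """Synthetic stand-in for an image crop: 2-pixel-wide black/white stripes."""
    return [
        [255 if (((x if vertical else y) + shift) // 2) % 2 == 0 else 0
         for x in range(size)]
        for y in range(size)
    ]


# Toy training set: one example per (hypothetical) class.
train = [
    (orientation_histogram(stripes(16, vertical=True)), "vertical"),
    (orientation_histogram(stripes(16, vertical=False)), "horizontal"),
]

# Classify a shifted pattern that is not in the training set.
query = orientation_histogram(stripes(16, vertical=True, shift=1))
prediction = knn_predict(train, query)
```

In a realistic setting the stripe generator would be replaced by mouth-region crops from the camera or depth sensor, the histogram would be computed per cell and block as in standard HOG, and the classifier would be trained on labeled frames for each viseme.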