Introduction to statistical pattern recognition (2nd ed.)
Introduction to statistical pattern recognition (2nd ed.)
Using Discriminant Eigenfeatures for Image Retrieval
IEEE Transactions on Pattern Analysis and Machine Intelligence
Pattern Recognition and Neural Networks
Pattern Recognition and Neural Networks
Towards a Northern Sotho talking head
AFRIGRAPH '07 Proceedings of the 5th international conference on Computer graphics, virtual reality, visualisation and interaction in Africa
Clustering Persian viseme using phoneme subspace for developing visual speech application
Multimedia Tools and Applications
Hi-index | 0.00 |
Real time classification algorithms are presented for visual mouth appearances (visemes) which correspond to phonemes and their speech contexts. They are used at the design of talking head application. Two feature extraction procedures were verified. The first one is based on the normalized triangle mesh covering mouth area and the color image texture vector indexed by barycentric coordinates. The second procedure performs Discrete Fourier Transform on the image rectangle including mouth w.r.t. a small block of DFT coefficients. The classifier has been designed by the optimized LDA method which uses two singular subspace approach. Despite of higher computational complexity (about three milliseconds per video frame on Pentium IV 3.2GHz), the DFT+LDA approach has practical advantages over MESH+LDA classifier. Firstly, it is better in recognition rate more than two percent (97.2% versus 99.3%). Secondly, the automatic identification of the covering mouth rectangle is more robust than the automatic identification of the covering mouth triangle mesh.