Automatic speech recognition (ASR) performs well under restricted conditions, but its performance degrades in noisy environments. Audio-visual speech recognition (AVSR) counters this by incorporating a visual signal into the recognition process. This paper briefly reviews the contribution of psycholinguistics to this endeavour and recent advances in machine AVSR. An important first step in AVSR is feature extraction from the mouth region. This paper examines several well-known pixel-based techniques (grayscale, horizontal edge, red and hue colour space) and compares how well they work on our naturalistic database. Finally, a novel method of feature extraction, red exclusion, is described that outperforms the others on this data set.
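
To make the four pixel-based representations named above concrete, the following is a minimal Python sketch of how each could be computed for a mouth-region image. It is an illustration under stated assumptions, not the paper's implementation: the input is assumed to be an RGB array with float values in [0, 1], the grayscale weights are the common BT.601 luma weights, the horizontal-edge map is a simple vertical central difference, and all function names are hypothetical. Red exclusion is not sketched because the abstract does not specify how it is computed.

```python
import colorsys

import numpy as np


def grayscale(rgb):
    """Luma-weighted grayscale (BT.601 weights, an assumption)."""
    return rgb @ np.array([0.299, 0.587, 0.114])


def horizontal_edges(gray):
    """Absolute vertical intensity difference.

    Intensity changes along the vertical axis respond to horizontal
    structures such as the line between the lips. Border rows stay 0.
    """
    edges = np.zeros_like(gray)
    edges[1:-1, :] = np.abs(gray[2:, :] - gray[:-2, :])
    return edges


def red_channel(rgb):
    """The red plane of the RGB image."""
    return rgb[..., 0]


def hue(rgb):
    """Per-pixel hue in [0, 1), via the standard-library colorsys module."""
    flat = rgb.reshape(-1, 3)
    h = np.array([colorsys.rgb_to_hsv(r, g, b)[0] for r, g, b in flat])
    return h.reshape(rgb.shape[:2])


def extract_features(rgb):
    """Return all four pixel-based representations for one image."""
    gray = grayscale(rgb)
    return {
        "grayscale": gray,
        "horizontal_edge": horizontal_edges(gray),
        "red": red_channel(rgb),
        "hue": hue(rgb),
    }


if __name__ == "__main__":
    # Synthetic stand-in for a mouth-region crop (random pixels).
    rng = np.random.default_rng(0)
    mouth = rng.random((32, 48, 3))
    for name, feature in extract_features(mouth).items():
        print(name, feature.shape)
```

Each function returns a single-channel image of the same height and width as the input, so the representations can be compared on equal footing or passed to the same downstream recogniser.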