The multimodal nature of speech is often ignored in human-computer interaction, but lip deformations and other body motions, such as those of the head, convey additional information. We integrate speech cues from many sources, and this improves intelligibility, especially when the acoustic signal is degraded. This paper shows how this additional, often complementary, visual speech information can be used for speech recognition. Three methods for parameterizing lip image sequences for recognition using hidden Markov models are compared. Two of these are top-down approaches that fit a model of the inner and outer lip contours and derive lipreading features from a principal component analysis of shape, or of shape and appearance, respectively. The third, bottom-up, method uses a nonlinear scale-space analysis to form features directly from the pixel intensity. All methods are compared on a multitalker visual speech recognition task of isolated letters.
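As an illustration of the shape-based, top-down idea only, the following minimal sketch derives low-dimensional features from a principal component analysis of contour point coordinates. It is not the paper's implementation: the synthetic elliptical "lip" contours, the function names `fit_shape_pca` and `shape_features`, and the choice of three modes are all assumptions made for the example.

```python
import numpy as np


def fit_shape_pca(shapes, n_modes):
    """Fit a linear shape model.

    shapes: (n_frames, 2 * n_points) array, each row a flattened contour
    (all x coordinates followed by all y coordinates).
    Returns the mean shape, the first n_modes principal directions (rows),
    and the variance explained by each of those directions.
    """
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # SVD of the centered data: rows of vt are orthonormal principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    modes = vt[:n_modes]
    variances = (s[:n_modes] ** 2) / (shapes.shape[0] - 1)
    return mean, modes, variances


def shape_features(shape, mean, modes):
    """Project one contour onto the shape modes to get a feature vector."""
    return modes @ (shape - mean)


# Hypothetical training data: ellipses whose height stands in for mouth opening.
rng = np.random.default_rng(0)
n_points = 20
angles = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
openness = rng.uniform(0.2, 1.0, size=50)
shapes = np.stack(
    [np.concatenate([2.0 * np.cos(angles), h * np.sin(angles)]) for h in openness]
)
shapes += rng.normal(scale=0.01, size=shapes.shape)  # small landmark noise

mean, modes, variances = fit_shape_pca(shapes, n_modes=3)
features = np.array([shape_features(s, mean, modes) for s in shapes])
```

In this toy setup the first mode tracks the synthetic mouth-opening parameter, so one row of `features` per video frame could serve as an observation vector for a sequence model such as a hidden Markov model. Real contours would first have to be located and aligned in each frame, which this sketch does not attempt.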