Speech recognition and speaker recognition by machine are crucial ingredients for many important applications, such as natural and flexible human-machine interfaces. Most developments in speech-based automatic recognition have relied on acoustic speech as the sole input signal, disregarding its visual counterpart. However, recognition based on acoustic speech alone can suffer from deficiencies that preclude its use in many real-world applications, particularly under adverse conditions. Combining the auditory and visual modalities promises higher recognition accuracy and robustness than can be obtained with a single modality. Multimodal recognition is therefore acknowledged as a vital component of the next generation of spoken language systems. This paper reviews the components of bimodal recognizers, discusses the accuracy of bimodal recognition, and highlights some outstanding research issues as well as possible application domains.