The cascading appearance-based (CAB) feature extraction technique has established itself as the state of the art in extracting dynamic visual speech features for speech recognition. In this paper, we investigate the effectiveness of this technique for the related task of speaker verification. By examining the speaker verification performance of each stage of the cascade, we demonstrate that the same steps taken to reduce static speaker and environmental information for visual speech recognition also provide similar improvements for visual speaker recognition. A further study compares synchronous HMM (SHMM) based fusion of CAB visual features and traditional perceptual linear predictive (PLP) acoustic features against simpler utterance-level score fusion, and shows that the higher complexity inherent in the SHMM approach does not appear to provide any improvement in the final audio-visual speaker verification system.
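
To make the idea of utterance-level score fusion concrete, the following is a minimal sketch and not the method used in the paper. It assumes each utterance already has one verification score per modality (for example, a log-likelihood ratio from an acoustic model and one from a visual model), that scores are z-normalized per modality, and that they are combined by a weighted sum. The function names, the normalization, the weight `audio_weight`, and the decision threshold are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Shift and scale scores to zero mean and unit variance.

    Assumption: z-normalization is one common way to put the two
    modalities on a comparable scale; the abstract does not say which
    normalization, if any, was used.
    """
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical: nothing to scale, return zeros.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(audio_scores, visual_scores, audio_weight=0.5):
    """Weighted-sum fusion of per-utterance scores from two modalities.

    audio_weight is a hypothetical tuning parameter in [0, 1]; the
    visual modality receives the remaining weight.
    """
    if len(audio_scores) != len(visual_scores):
        raise ValueError("each utterance needs one score per modality")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    audio = z_normalize(audio_scores)
    visual = z_normalize(visual_scores)
    return [
        audio_weight * a + (1.0 - audio_weight) * v
        for a, v in zip(audio, visual)
    ]


def verify(fused_scores, threshold=0.0):
    """Accept a claimed identity when the fused score meets the threshold."""
    return [score >= threshold for score in fused_scores]
```

In a sketch like this the two modalities are modelled and scored independently, and only one number per utterance per modality is combined. That is the sense in which it is simpler than SHMM-based fusion, which models the audio and visual streams jointly.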