This paper presents a speaker identification system based on dynamic features of both the audio and visual modalities. Speakers are modeled using a text-dependent HMM methodology. Both early and late audio-visual integration are investigated. Experiments are carried out on 252 speakers from the XM2VTS database. The experimental results show that adding dynamic visual information improves speaker identification accuracy over the audio-only case under both clean and noisy audio conditions. The best audio, visual, and audio-visual identification accuracies achieved were 86.91%, 57.14%, and 94.05%, respectively.
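
The abstract does not say how the two integration schemes are implemented. As a rough illustration only, the sketch below shows one common reading: early integration concatenates frame-synchronous audio and visual feature vectors before modeling, while late integration combines per-speaker scores from separately trained audio and visual models. The function names, the linear weighting, and the `audio_weight` parameter are assumptions made for this example, not details taken from the paper.

```python
from typing import Dict, List, Sequence


def early_fusion(audio_frames: Sequence[Sequence[float]],
                 visual_frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Early integration (assumed form): concatenate per-frame feature vectors.

    Both streams are assumed to be already aligned to the same frame rate.
    """
    if len(audio_frames) != len(visual_frames):
        raise ValueError("audio and visual streams must have the same number of frames")
    return [list(a) + list(v) for a, v in zip(audio_frames, visual_frames)]


def late_fusion(audio_scores: Dict[str, float],
                visual_scores: Dict[str, float],
                audio_weight: float = 0.7) -> Dict[str, float]:
    """Late integration (assumed form): weighted sum of per-speaker log-likelihoods.

    audio_weight is a hypothetical tuning parameter; a lower value would
    typically be chosen when the audio is noisy.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return {
        speaker: audio_weight * audio_scores[speaker]
        + (1.0 - audio_weight) * visual_scores[speaker]
        for speaker in audio_scores
    }


def identify(scores: Dict[str, float]) -> str:
    """Return the speaker whose combined score is highest."""
    return max(scores, key=scores.get)
```

In this sketch, the concatenated vectors from `early_fusion` would be passed to a single model per speaker, whereas `late_fusion` expects scores already produced by two independent models. Setting `audio_weight=1.0` reduces the late scheme to the audio-only case.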