The M2VTS Multimodal Face Database (Release 1.00)
AVBPA '97 Proceedings of the First International Conference on Audio- and Video-Based Biometric Person Authentication
Information Fusion in Biometrics
AVBPA '01 Proceedings of the Third International Conference on Audio- and Video-Based Biometric Person Authentication
Audio-visual speech modeling for continuous speech recognition
IEEE Transactions on Multimedia
Toward adaptive information fusion in multimodal systems
MLMI '05 Proceedings of the Second International Conference on Machine Learning for Multimodal Interaction
Combining user modeling and machine learning to predict users' multimodal integration patterns
MLMI '06 Proceedings of the Third International Conference on Machine Learning for Multimodal Interaction
It has often been shown that authenticating a person's identity with multiple modalities is more robust than relying on a single one. Various combination techniques exist, and fusion is often performed at the level of the output scores of each modality-specific system. In this paper, we present a novel HMM architecture that models the joint probability distribution of a pair of asynchronous sequences describing the same event, such as the speech and video streams of a talking face. We show how this model can be used for audio-visual person authentication. Results on the M2VTS database show that the system performs robustly under various audio noise conditions compared with other state-of-the-art techniques.
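The key idea of such an asynchronous HMM is that the two streams need not advance in lockstep: the model emits one audio frame per step but consumes the next video frame only with some probability, so the streams are aligned jointly during decoding. A minimal sketch of the corresponding forward recursion is shown below; all parameter names (`log_a`, `eps`, etc.) and the fixed per-step video-emission probability are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def _lse(a, axis=0):
    # Numerically stable log-sum-exp that tolerates all -inf slices.
    m = np.max(a, axis=axis)
    with np.errstate(invalid="ignore"):
        out = m + np.log(np.sum(np.exp(a - np.expand_dims(m, axis)), axis=axis))
    return np.where(np.isneginf(m), -np.inf, out)

def async_forward(log_a, log_pi, log_bx, log_by, eps):
    """Forward pass of a sketch asynchronous HMM (hypothetical parameters).

    log_a:  (N, N) state-transition log-probs
    log_pi: (N,)   initial state log-probs
    log_bx: (T, N) per-state log-likelihoods of the T audio frames
    log_by: (S, N) per-state log-likelihoods of the S video frames
    eps:    probability of consuming one video frame at each audio step
    """
    T, N = log_bx.shape
    S = log_by.shape[0]
    log_eps, log_1meps = np.log(eps), np.log1p(-eps)
    # alpha[t, s, i]: log prob of audio x[:t+1], first s video frames, state i
    alpha = np.full((T, S + 1, N), -np.inf)
    alpha[0, 0] = log_pi + log_bx[0] + log_1meps
    if S >= 1:
        alpha[0, 1] = log_pi + log_bx[0] + log_eps + log_by[0]
    for t in range(1, T):
        for s in range(S + 1):
            # stay on the same video position (no video frame emitted)
            trans = _lse(alpha[t - 1, s][:, None] + log_a, axis=0)
            val = trans + log_bx[t] + log_1meps
            if s >= 1:
                # advance the video stream by one frame
                trans_e = _lse(alpha[t - 1, s - 1][:, None] + log_a, axis=0)
                val = np.logaddexp(val, trans_e + log_bx[t] + log_eps + log_by[s - 1])
            alpha[t, s] = val
    # require that all S video frames have been consumed by the end
    return _lse(alpha[T - 1, S], axis=0)
```

Because at most one video frame is consumed per audio step, the recursion yields a finite log-likelihood only when `S <= T`; the dynamic program is O(T·S·N²), the price paid for marginalizing over all stream alignments.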