Bimodal speaker identification using dynamic Bayesian network

  • Authors:
  • Dongdong Li;LiFeng Sang;Yingchun Yang;Zhaohui Wu

  • Affiliations:
  • Department of Computer Science, Zhejiang University, Hangzhou, P.R. China (all authors)

  • Venue:
  • SINOBIOMETRICS'04 Proceedings of the 5th Chinese conference on Advances in Biometric Person Authentication
  • Year:
  • 2004

Abstract

The authentication of a person requires consistently high recognition accuracy, which is difficult to attain using a single recognition modality. This paper assesses the fusion of voiceprint and face features for bimodal speaker identification using a Dynamic Bayesian Network (DBN). Our contribution is a general feature-level fusion framework for bimodal speaker identification. Within the framework, the voice and face features are combined into a single DBN to obtain better performance than either single-modality system alone. The tests were conducted on a multi-modal database of 54 users who provided voiceprint and face data of different speech types and content. We compare our approach with mono-modal systems and other classic decision-level methods, and show that feature-level fusion using a dynamic Bayesian network improved performance by about 4-5%, much better than the others.
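The core idea of feature-level fusion, as opposed to decision-level fusion, is to combine the voice and face feature vectors into a single observation before any classification happens. The sketch below illustrates that idea only: features, dimensions, and the nearest-mean classifier are all hypothetical stand-ins (the paper's actual method trains a DBN over the fused features, which is not reproduced here).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-user features: stand-ins for real voiceprint (MFCC-like)
# and face (appearance-like) features; dimensions are illustrative only.
def make_samples(voice_mean, face_mean, n=50):
    voice = rng.normal(voice_mean, 1.0, size=(n, 12))  # voice feature frames
    face = rng.normal(face_mean, 1.0, size=(n, 8))     # face feature frames
    # Feature-level fusion: concatenate modalities into one observation vector.
    return np.hstack([voice, face])

# Enroll two users with well-separated synthetic feature distributions.
enrolled = {
    "user_a": make_samples(0.0, 0.0),
    "user_b": make_samples(3.0, 3.0),
}
# A nearest-mean template per user stands in for the DBN model.
models = {user: feats.mean(axis=0) for user, feats in enrolled.items()}

def identify(fused_observation):
    # Score each enrolled model against the fused observation; closest wins.
    return min(models, key=lambda u: np.linalg.norm(fused_observation - models[u]))

probe = make_samples(3.0, 3.0, n=1)[0]  # a probe drawn near user_b's distribution
print(identify(probe))
```

Because fusion happens before classification, a single model can exploit correlations between the two modalities, which is the advantage the paper reports over decision-level combination of separately trained classifiers.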