Multifactor fusion for audio-visual speaker recognition

  • Authors:
  • Girija Chetty; Dat Tran

  • Affiliations:
  • School of Information Sciences and Engineering, University of Canberra, Australia (both authors)

  • Venue:
  • SSIP'07 Proceedings of the 7th WSEAS International Conference on Signal, Speech and Image Processing
  • Year:
  • 2007

Abstract

In this paper we propose a multifactor hybrid fusion approach for enhancing security in audio-visual speaker verification. Speaker verification experiments conducted on two audio-visual databases, VidTIMIT and UCBN, show that multifactor hybrid fusion, which combines feature-level fusion of lip-voice features with score-level fusion of face-lip-voice features, is a powerful technique for speaker identity verification, as it preserves the synchronisation of closely coupled modalities, such as the face, voice and lip dynamics of a speaker during speech, through the various stages of authentication. An improvement in error rate of the order of 22-36% is achieved by using feature-level fusion of acoustic and visual feature vectors from the lip region, as compared to the classical late fusion approach.
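The hybrid scheme described above can be sketched in a few lines: feature-level (early) fusion concatenates the tightly coupled lip and voice features before classification, while score-level (late) fusion combines per-modality match scores afterwards. The sketch below is purely illustrative; the feature dimensions, the sigmoid stand-in classifier, and the fusion weight `w` are assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-utterance feature vectors (dimensions are illustrative).
voice_feat = rng.normal(size=12)   # e.g. acoustic (MFCC-like) features
lip_feat   = rng.normal(size=8)    # e.g. lip-region visual features
face_feat  = rng.normal(size=10)   # e.g. face appearance features

# Feature-level (early) fusion: concatenate lip and voice features so the
# classifier sees the synchronised audio-visual evidence jointly.
lip_voice = np.concatenate([voice_feat, lip_feat])   # shape (20,)

def classifier_score(feat):
    """Stand-in for a trained verifier; maps a feature vector to [0, 1]."""
    return 1.0 / (1.0 + np.exp(-feat.mean()))

# Score-level (late) fusion: combine the lip-voice score with the face
# score via a weighted sum (the weight is an illustrative choice).
s_lip_voice = classifier_score(lip_voice)
s_face = classifier_score(face_feat)
w = 0.6
fused_score = w * s_lip_voice + (1 - w) * s_face

# Verification decision against a threshold (threshold is illustrative).
accept = fused_score > 0.5
print(lip_voice.shape, fused_score)
```

A pure late-fusion baseline would instead score the voice and lip features with separate classifiers and merge only their scores, discarding the cross-modal synchrony that the early-fusion step retains.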