Robust audio-visual speaker identification using a modified score-based reliability in modality integration

  • Authors:
  • Md. Tariquzzaman;Jin Young Kim;Seung You Na

  • Affiliations:
  • Chonnam National University, Gwangju, Republic of Korea;Chonnam National University, Gwangju, Republic of Korea;Chonnam National University, Gwangju, Republic of Korea

  • Venue:
  • Proceedings of the International Conference on Management of Emergent Digital EcoSystems
  • Year:
  • 2009

Abstract

Reliable identity recognition in real environments is a key issue in human-computer interaction (HCI). In this paper, we present a robust speaker identification system that uses a score-based optimal reliability measure for each modality. We propose an extension of the optimizing parameter of the modified convection function so that, in a bimodal speaker identification system, optimal reliability is accounted for simultaneously through reliability measures based on audio and lip information. To degrade the visual signals, we applied JPEG compression to the test images. In addition, to create a mismatch between the enrollment and test sessions, acoustic babble noise and artificial illumination were added to the test audio and visual signals, respectively. Local PCA was applied to both modalities to reduce the dimension of the feature vectors. We applied a swarm intelligence algorithm, particle swarm optimization, to optimize the parameters of the modified convection function. All speaker identification experiments were performed on the VidTIMIT database. Experimental results show that the proposed optimal reliability measures improve identification accuracy by 7.73% compared with the best classifier in the integration system and preserve the modality reliability statistics in terms of performance, which verifies the consistency of the proposed extension.
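
The general idea of score-level modality integration with an optimized weight can be illustrated with a minimal sketch. The abstract does not give the form of the modified convection function or of the reliability measures, so this example substitutes a plain weighted sum of min-max normalized scores, with a single weight `alpha` tuned by a basic particle swarm. All function names, the inertia and acceleration constants, and the trial format are assumptions made for illustration, not the authors' method.

```python
import random

def normalize(scores):
    """Min-max normalize one modality's per-speaker scores to [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def fuse(audio, visual, alpha):
    """Weighted-sum fusion (an assumed stand-in for the paper's function):
    alpha weights the audio scores, (1 - alpha) the visual scores."""
    a, v = normalize(audio), normalize(visual)
    return [alpha * x + (1.0 - alpha) * y for x, y in zip(a, v)]

def identify(audio, visual, alpha):
    """Return the index of the speaker with the highest fused score."""
    fused = fuse(audio, visual, alpha)
    return max(range(len(fused)), key=fused.__getitem__)

def accuracy(trials, alpha):
    """Fraction of (audio_scores, visual_scores, true_index) trials identified correctly."""
    hits = sum(identify(a, v, label) == label if False else identify(a, v, alpha) == label
               for a, v, label in trials)
    return hits / len(trials)

def pso_alpha(trials, n_particles=10, n_iters=30, seed=0):
    """Basic one-dimensional particle swarm search for alpha in [0, 1].
    The constants w, c1, c2 are common textbook choices, not values from the paper."""
    rng = random.Random(seed)
    pos = [rng.random() for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                                   # per-particle best position
    best_fit = [accuracy(trials, p) for p in pos]       # per-particle best fitness
    g = max(range(n_particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g], best_fit[g]             # swarm-wide best
    w, c1, c2 = 0.7, 1.5, 1.5
    for _ in range(n_iters):
        for i in range(n_particles):
            vel[i] = (w * vel[i]
                      + c1 * rng.random() * (best_pos[i] - pos[i])
                      + c2 * rng.random() * (g_pos - pos[i]))
            pos[i] = min(1.0, max(0.0, pos[i] + vel[i]))  # keep alpha in [0, 1]
            fit = accuracy(trials, pos[i])
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i], fit
                if fit > g_fit:
                    g_pos, g_fit = pos[i], fit
    return g_pos, g_fit
```

In this sketch, `alpha = 1` reduces to audio-only identification and `alpha = 0` to visual-only, so the tuned weight expresses how much each modality is trusted on the tuning data. In practice the weight would be tuned on held-out trials that reflect the expected noise and illumination mismatch.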