A decision fusion system across time and classifiers for audio-visual person identification

  • Authors:
  • Andreas Stergiou;Aristodemos Pnevmatikakis;Lazaros Polymenakos

  • Affiliations:
  • Athens Information Technology, Autonomic and Grid Computing, Peania, Greece;Athens Information Technology, Autonomic and Grid Computing, Peania, Greece;Athens Information Technology, Autonomic and Grid Computing, Peania, Greece

  • Venue:
  • CLEAR'06 Proceedings of the 1st international evaluation conference on Classification of events, activities and relationships
  • Year:
  • 2006

Abstract

This paper presents the person identification system developed at Athens Information Technology. It comprises an audio-only (speech) subsystem, a video-only (face) subsystem, and an audiovisual fusion subsystem. Audio recognition is based on Gaussian mixture modeling of the principal components of the Mel-frequency cepstral coefficients of speech. Video recognition is based on linear subspace projection methods, with temporal fusion by weighted voting on the results. Audiovisual fusion combines the unimodal identities into a multimodal one, using a suitable confidence metric for the results of the unimodal classifiers.
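
The two fusion steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the voting weights or the confidence metric, so everything here is an assumption made for illustration: the function names `weighted_vote` and `fuse_modalities`, the use of the winner's share of the total weight as a confidence value, and the rule of keeping the more confident modality are hypothetical and may differ from the system described in the paper.

```python
from collections import defaultdict


def weighted_vote(decisions):
    """Fuse a sequence of (identity, weight) decisions by weighted voting.

    Returns (winning_identity, confidence). The confidence used here, the
    winner's share of the total weight, is an assumed metric, not the one
    from the paper.
    """
    totals = defaultdict(float)
    for identity, weight in decisions:
        totals[identity] += weight
    if not totals:
        raise ValueError("no decisions to fuse")
    winner = max(totals, key=totals.get)
    total_weight = sum(totals.values())
    confidence = totals[winner] / total_weight if total_weight > 0 else 0.0
    return winner, confidence


def fuse_modalities(audio, video):
    """Combine two unimodal (identity, confidence) results.

    Assumed rule: keep the result of the more confident modality,
    with ties going to audio.
    """
    return audio if audio[1] >= video[1] else video


# Hypothetical decisions from a face classifier over time, with made-up weights.
video_decisions = [("alice", 0.9), ("bob", 0.4), ("alice", 0.7), ("bob", 0.5)]
video_result = weighted_vote(video_decisions)

# Hypothetical audio-only result with a made-up confidence.
audio_result = ("bob", 0.55)

fused_identity, fused_confidence = fuse_modalities(audio_result, video_result)
print(fused_identity, round(fused_confidence, 2))
```

In this toy run the video votes favor "alice" with a weight share of about 0.64, which exceeds the assumed audio confidence of 0.55, so the fused identity follows the video result.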