The AIT Multimodal Person Identification System for CLEAR 2007

  • Authors:
  • Andreas Stergiou;Aristodemos Pnevmatikakis;Lazaros Polymenakos

  • Affiliations:
  • Autonomic and Grid Computing, Athens Information Technology, Peania, Greece 19002;Autonomic and Grid Computing, Athens Information Technology, Peania, Greece 19002;Autonomic and Grid Computing, Athens Information Technology, Peania, Greece 19002

  • Venue:
  • Multimodal Technologies for Perception of Humans
  • Year:
  • 2008

Abstract

This paper presents the person identification system developed at Athens Information Technology and its performance in the CLEAR 2007 evaluations. The system operates on the audiovisual information (speech and faces) collected over the duration of gallery and probe videos. It comprises an audio-only (speech) subsystem, a video-only (face) subsystem and an audiovisual fusion subsystem. Audio recognition is based on Gaussian mixture modeling of the principal components of composite feature vectors, which consist of Mel-Frequency Cepstral Coefficients and Perceptual Linear Prediction coefficients of speech. Video recognition combines three different classification algorithms: Principal Components Analysis with a modified Mahalanobis distance, sub-class Linear Discriminant Analysis (featuring automatic sub-class generation) with cosine distance, and a Bayesian classifier based on Gaussian modeling of intrapersonal differences. Each of the three uses a nearest neighbor classification rule. A decision fusion scheme across time and classifiers returns the video identity. The audiovisual subsystem fuses the unimodal identities into the multimodal one, using a suitable confidence metric.
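The two fusion stages described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion rule or the confidence metric, so everything here is an assumption made for illustration: majority voting stands in for the decision fusion across time and classifiers, the winning vote share stands in for the confidence metric, and the function names `fuse_votes` and `fuse_audiovisual` and the sample labels are hypothetical.

```python
from collections import Counter


def fuse_votes(decisions):
    """Fuse identity labels pooled over frames and classifiers.

    Assumption: plain majority voting, with the winning vote share
    used as a stand-in confidence. The paper's actual scheme may differ.
    """
    if not decisions:
        raise ValueError("at least one decision is required")
    counts = Counter(decisions)
    identity, votes = counts.most_common(1)[0]
    return identity, votes / len(decisions)


def fuse_audiovisual(audio, video):
    """Choose between two (identity, confidence) pairs.

    Assumption: the modality with the higher confidence wins,
    and audio wins ties.
    """
    return audio if audio[1] >= video[1] else video


# Hypothetical per-frame decisions pooled from the three face classifiers.
video_decisions = ["anna", "anna", "bob", "anna", "bob", "anna"]
video = fuse_votes(video_decisions)

# Hypothetical audio decision with an assumed confidence value.
audio = ("bob", 0.9)

identity, confidence = fuse_audiovisual(audio, video)
print(identity, confidence)
```

In this toy case the video vote favors one identity with a moderate vote share, while the audio decision carries a higher assumed confidence, so the audio identity is returned. A real system would need the two confidences to be calibrated on a comparable scale before comparing them.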