A decision fusion system across time and classifiers for audio-visual person identification

Authors:
Andreas Stergiou;Aristodemos Pnevmatikakis;Lazaros Polymenakos
Affiliations:
Athens Information Technology, Autonomic and Grid Computing, Peania, Greece;Athens Information Technology, Autonomic and Grid Computing, Peania, Greece;Athens Information Technology, Autonomic and Grid Computing, Peania, Greece
Venue:
CLEAR'06 Proceedings of the 1st international evaluation conference on Classification of events, activities and relationships
Year:
2006

Abstract

This paper presents the person identification system developed at Athens Information Technology. It comprises an audio-only (speech) subsystem, a video-only (face) subsystem and an audiovisual fusion subsystem. Audio recognition is based on Gaussian mixture modeling of the principal components of the Mel-frequency cepstral coefficients of speech. Video recognition is based on linear subspace projection methods, with temporal fusion performed by weighted voting on the results. Audiovisual fusion combines the unimodal identities into a multimodal one, using a suitable confidence metric for the results of the unimodal classifiers.
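
The two fusion steps named in the abstract, weighted voting across time and confidence-based fusion across modalities, can be illustrated with a minimal Python sketch. The abstract does not specify the weights or the confidence metric, so everything here is an assumption for illustration: the function names `weighted_vote` and `fuse_modalities` are hypothetical, the confidence is taken to be the normalized margin between the two highest vote totals, and the multimodal rule simply keeps the identity of the more confident unimodal classifier.

```python
from collections import defaultdict


def weighted_vote(decisions):
    """Fuse a sequence of (identity, weight) decisions by weighted voting.

    Returns the winning identity and an ASSUMED confidence value: the margin
    between the two highest vote totals, divided by the sum of all weights.
    This metric is a stand-in; the abstract does not define the actual one.
    """
    totals = defaultdict(float)
    for identity, weight in decisions:
        totals[identity] += weight

    # Rank identities by their accumulated weight, highest first.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    best_identity, best_score = ranked[0]
    runner_up_score = ranked[1][1] if len(ranked) > 1 else 0.0

    weight_sum = sum(totals.values())
    confidence = (best_score - runner_up_score) / weight_sum if weight_sum > 0 else 0.0
    return best_identity, confidence


def fuse_modalities(audio_result, video_result):
    """Combine two unimodal (identity, confidence) results into one identity.

    ASSUMED rule: if the classifiers agree, keep that identity; otherwise
    keep the identity reported with the higher confidence.
    """
    audio_identity, audio_confidence = audio_result
    video_identity, video_confidence = video_result
    if audio_identity == video_identity:
        return audio_identity
    return audio_identity if audio_confidence >= video_confidence else video_identity


# Hypothetical decisions with made-up weights, one per video frame.
frame_decisions = [("alice", 0.9), ("bob", 0.4), ("alice", 0.7), ("bob", 0.5)]
video_result = weighted_vote(frame_decisions)

# Hypothetical audio result with a low confidence value.
audio_result = ("bob", 0.1)
fused_identity = fuse_modalities(audio_result, video_result)
```

In this toy run the video votes favor "alice", and because the assumed video confidence exceeds the audio confidence, the fused identity follows the video classifier. A real system would derive the weights and confidences from classifier scores rather than fixed numbers.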