Multimodal speaker identification using an adaptive classifier cascade based on modality reliability

Authors:
Engin Erzin;Y. Yemez;A. M. Tekalp
Affiliations:
Multimedia, Koc Univ., Istanbul, Turkey
Venue:
IEEE Transactions on Multimedia
Year:
2005

Cited by:
Multimodal Person Recognition for Human-Vehicle Interaction (IEEE MultiMedia)
Multimodal speaker/speech recognition using lip motion, lip texture and audio (Signal Processing - Special section: Multimodal human-computer interfaces)
Empirical evaluation of combining unobtrusiveness and security requirements in multimodal biometric systems (Image and Vision Computing)
Optimal weighting of bimodal biometric information with specific application to audio-visual person identification (Information Fusion)
Formant position based weighted spectral features for emotion recognition (Speech Communication)
RANSAC-based training data selection on spectral features for emotion recognition from spontaneous speech (COST'10 Proceedings of the 2010 international conference on Analysis of Verbal and Nonverbal Communication and Enactment)
Advanced authoring tools for game-based training (SCSC '09 Proceedings of the 2009 Summer Computer Simulation Conference)
Individuality in communicative bodily behaviours (COST'11 Proceedings of the 2011 international conference on Cognitive Behavioural Systems)

Abstract

We present a multimodal open-set speaker identification system that integrates information from the audio, face, and lip motion modalities. To fuse the modalities, we propose a new adaptive cascade rule that favors reliable modality combinations through a cascade of classifiers. The order of the classifiers in the cascade is determined adaptively, based on the reliability of each modality combination. We also propose a novel reliability measure, well suited to the open-set speaker identification problem, for assessing the accept or reject decisions of a classifier. A formal framework based on the probability of correct decision is developed for analytical comparison of the proposed adaptive rule with other classifier combination rules. The proposed adaptive rule is more robust in the presence of unreliable modalities and outperforms the hard-level max rule and the soft-level weighted summation rule, provided that the employed reliability measure is effective in assessing classifier decisions. Experimental results that support this assertion are provided.
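
The cascade idea in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract does not define the reliability measure or the stage-acceptance test, so the `StageOutput` type, the `reliability` and `confidence` fields, the single `threshold`, and the fallback to the last stage are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class StageOutput:
    """Output of one classifier (one modality combination) for one test sample."""
    name: str                # hypothetical label, e.g. "audio" or "audio+lip"
    speaker: Optional[str]   # claimed identity; None means "reject" (open set)
    reliability: float       # assumed score used to order the cascade
    confidence: float        # assumed per-decision score checked at each stage


def adaptive_cascade(outputs: Sequence[StageOutput],
                     threshold: float = 0.5) -> StageOutput:
    """Return the decision of the first sufficiently confident stage.

    Stages are visited from most to least reliable. If no earlier stage
    passes the threshold, the last stage visited decides (an assumption).
    """
    if not outputs:
        raise ValueError("at least one classifier output is required")
    # Adaptive ordering: the most reliable modality combination goes first.
    ordered = sorted(outputs, key=lambda o: o.reliability, reverse=True)
    for stage in ordered[:-1]:
        if stage.confidence >= threshold:
            return stage
    return ordered[-1]
```

Under these assumptions, an unreliable modality is pushed toward the end of the cascade, so it only decides when every more reliable stage is unsure of its own accept or reject decision.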