Optimal weighting of bimodal biometric information with specific application to audio-visual person identification

Authors:
Roland Hu; R. I. Damper
Affiliations:
Information: Signals, Images, Systems (ISIS) Research Group, School of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, UK (both authors)
Venue:
Information Fusion
Year:
2009


Abstract

A new method is proposed for estimating the optimal weighting parameter used to combine audio (speech) and visual (face) information in person identification. It is based on estimating probability density functions (pdfs) of the classifier scores under Gaussian assumptions. Performance comparisons on real and simulated data indicate that the method reduces both the bias and the variance of the estimate relative to the other methods tried, and so yields a robust estimator of the optimal weighting parameter. A further contribution is the use of the bootstrap method to compare the performance of different algorithms for estimating the optimal weighting parameter, which provides a rigorous criterion for comparing algorithms of this kind. Using simulated data, for which the pdf is controlled and known, we show that the advantages of the method hold up when the underlying Gaussian assumption is violated. The main drawback is that an adjustable parameter has to be chosen, and it is not clear how this should best be done.
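
The ideas summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator. It assumes a simplified two-class view (genuine versus impostor scores), independent audio and visual scores, a fused score of the form alpha * audio + (1 - alpha) * visual, and a d-prime separability criterion chosen only for illustration. All function names, the grid search, and the simulated parameters are hypothetical. The bootstrap step resamples the scores with replacement and re-estimates the weight each time, so that the spread and the offset of the estimates can be inspected.

```python
import numpy as np


def fit_gaussian(scores):
    """Return the sample mean and unbiased sample variance of 1-D scores."""
    return float(np.mean(scores)), float(np.var(scores, ddof=1))


def separability(alpha, gen_a, imp_a, gen_v, imp_v):
    """d-prime of the fused score; each argument is a (mean, variance) pair.

    Assumes independent Gaussian audio and visual scores (illustrative only).
    """
    mu_gen = alpha * gen_a[0] + (1 - alpha) * gen_v[0]
    mu_imp = alpha * imp_a[0] + (1 - alpha) * imp_v[0]
    var = alpha**2 * (gen_a[1] + imp_a[1]) + (1 - alpha)**2 * (gen_v[1] + imp_v[1])
    return (mu_gen - mu_imp) / np.sqrt(var)


def estimate_alpha(ga, ia, gv, iv, n_grid=101):
    """Grid-search the weight in [0, 1] that maximises the fitted d-prime."""
    params = [fit_gaussian(x) for x in (ga, ia, gv, iv)]
    grid = np.linspace(0.0, 1.0, n_grid)
    values = [separability(a, *params) for a in grid]
    return float(grid[int(np.argmax(values))])


def bootstrap_alpha(ga, ia, gv, iv, n_boot=200, seed=0):
    """Re-estimate the weight on bootstrap resamples of the scores.

    Genuine (and impostor) audio/visual scores are resampled with the same
    indices, so ga/gv and ia/iv must have matching lengths.
    """
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx_g = rng.integers(0, len(ga), size=len(ga))
        idx_i = rng.integers(0, len(ia), size=len(ia))
        estimates[b] = estimate_alpha(ga[idx_g], ia[idx_i], gv[idx_g], iv[idx_i])
    return estimates


# Simulated scores with known parameters (hypothetical values).
# For these parameters the d-prime-maximising weight is 2/3.
rng = np.random.default_rng(1)
gen_audio = rng.normal(2.0, 1.0, 500)
imp_audio = rng.normal(0.0, 1.0, 500)
gen_visual = rng.normal(1.0, 1.0, 500)
imp_visual = rng.normal(0.0, 1.0, 500)

boot = bootstrap_alpha(gen_audio, imp_audio, gen_visual, imp_visual)
print("mean of bootstrap estimates:", boot.mean())
print("std of bootstrap estimates: ", boot.std(ddof=1))
print("offset from known optimum:  ", boot.mean() - 2.0 / 3.0)
```

Because the simulated pdfs are known, the offset of the bootstrap mean from the known optimum and the bootstrap standard deviation give a rough indication of bias and variance, which is the kind of comparison the abstract describes for competing estimators.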