Towards robust person recognition on handheld devices using face and speaker identification technologies

Authors:
Timothy J. Hazen; Eugene Weinstein; Alex Park
Affiliations:
MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA; MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA; MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA
Venue:
Proceedings of the 5th International Conference on Multimodal Interfaces
Year:
2003

Abstract

Most face and speaker identification techniques are tested on data collected in controlled environments using high-quality cameras and microphones. Using these technologies in variable environments, with the inexpensive audio and image capture hardware found in mobile devices, poses an additional challenge. In this study, we investigate the application of existing face and speaker identification techniques to a person identification task on a handheld device. These techniques have proven accurate in tightly constrained experiments where the lighting conditions, visual backgrounds, and audio environments are fixed and specifically adjusted for optimal data quality. When they are applied on mobile devices, where the visual and audio conditions are highly variable, degradations in performance can be expected. Under these circumstances, combining multiple biometric modalities can improve the robustness and accuracy of person identification. In this paper, we present our approach for combining face and speaker identification technologies and experimentally demonstrate a fused multi-biometric system that achieves a 50% reduction in equal error rate over the better of the two independent systems.
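
For readers unfamiliar with the metric, the equal error rate (EER) is the operating point at which the false accept rate equals the false reject rate. The sketch below is illustrative only and is not the authors' method: it assumes each modality produces a per-trial score where higher means "more likely the claimed person", and it uses a simple weighted sum as a stand-in fusion rule. The function names, the `weight` parameter, and all scores used with them are hypothetical.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping a threshold over all observed scores.

    genuine:  scores from trials where the claimed identity is correct
    impostor: scores from trials where the claimed identity is wrong
    A trial is accepted when its score is >= the threshold.
    """
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(genuine) | set(impostor)):
        # Fraction of impostor trials wrongly accepted at this threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        # Fraction of genuine trials wrongly rejected at this threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def fuse(face_score, speaker_score, weight=0.5):
    """Weighted-sum fusion of two scores (an assumed rule, for illustration)."""
    return weight * face_score + (1 - weight) * speaker_score


def relative_reduction(baseline_eer, fused_eer):
    """Relative EER reduction, e.g. 0.5 corresponds to a 50% reduction."""
    return (baseline_eer - fused_eer) / baseline_eer
```

Under these assumptions, a "50% reduction in equal error rate over the better of the two independent systems" means that `relative_reduction(min(face_eer, speaker_eer), fused_eer)` evaluates to about 0.5, where each EER is computed on the same set of trials.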