Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection
IEEE Transactions on Pattern Analysis and Machine Intelligence
Person Identification Using Multiple Cues
IEEE Transactions on Pattern Analysis and Machine Intelligence
Multiple View Geometry in Computer Vision
Face Recognition Using Laplacianfaces
IEEE Transactions on Pattern Analysis and Machine Intelligence
Audio-visual talking face detection
ICME '03 Proceedings of the 2003 International Conference on Multimedia and Expo - Volume 1
Weakly Supervised Scale-Invariant Learning of Models for Visual Recognition
International Journal of Computer Vision
Audiovisual Probabilistic Tracking of Multiple Speakers in Meetings
IEEE Transactions on Audio, Speech, and Language Processing
Interrelation Between Speech and Facial Gestures in Emotional Utterances: A Single Subject Study
IEEE Transactions on Audio, Speech, and Language Processing
Audio-Visual Affect Recognition
IEEE Transactions on Multimedia
Audio–Visual Affective Expression Recognition Through Multistream Fused HMM
IEEE Transactions on Multimedia
Multimodal decision-level fusion for person authentication
IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans
Human emotion recognition from videos using spatio-temporal and audio features
The Visual Computer: International Journal of Computer Graphics
In this paper, a novel video-based multimodal biometric verification scheme is developed for specific speaker recognition in perceptual human-computer interaction (HCI), using subspace-based low-level fusion of face and speech features. In the proposed scheme, the human face is tracked and the face pose is estimated to weight the detected face-like regions in successive frames, so that ill-posed faces and false-positive detections receive lower credit, enhancing accuracy. In the audio modality, mel-frequency cepstral coefficients (MFCCs) are extracted for voice-based biometric verification. In the fusion step, features from both modalities are projected into a nonlinear Laplacian Eigenmap subspace and combined at low level for multimodal speaker recognition. The proposed approach is tested on a video database of ten human subjects; the results show that it attains better accuracy than both conventional multimodal fusion using latent semantic analysis and the single-modality verifications. A MATLAB experiment shows the potential of the proposed scheme to attain real-time performance in perceptual HCI applications.
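The core fusion step described above — concatenating per-frame face and MFCC feature vectors and projecting them into a nonlinear Laplacian Eigenmap subspace — can be sketched as follows. This is a minimal illustration of the standard Laplacian Eigenmaps algorithm (k-nearest-neighbour graph with heat-kernel weights, then the generalised eigenproblem on the graph Laplacian), not the authors' exact implementation; the feature dimensions, neighbourhood size `k`, and the median-distance bandwidth heuristic are all assumptions, and the input features here are random placeholders.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def laplacian_eigenmap(X, n_components=2, k=5):
    """Embed the rows of X into an n_components-dim Laplacian Eigenmap subspace."""
    n = X.shape[0]
    D2 = cdist(X, X, 'sqeuclidean')          # pairwise squared distances
    t = np.median(D2)                        # heat-kernel bandwidth (a common heuristic)
    idx = np.argsort(D2, axis=1)[:, 1:k + 1] # k nearest neighbours, excluding self
    W = np.zeros((n, n))
    for i in range(n):
        W[i, idx[i]] = np.exp(-D2[i, idx[i]] / t)
    W = np.maximum(W, W.T)                   # symmetrise the adjacency graph
    D = np.diag(W.sum(axis=1))               # degree matrix
    L = D - W                                # unnormalised graph Laplacian
    # Generalised eigenproblem L v = lambda D v; eigh returns ascending eigenvalues,
    # so skip the trivial constant eigenvector at eigenvalue 0.
    _, vecs = eigh(L, D)
    return vecs[:, 1:n_components + 1]

# Low-level fusion: concatenate per-frame face and MFCC feature vectors,
# then project into a shared nonlinear subspace.
rng = np.random.default_rng(0)
face_feats = rng.normal(size=(40, 30))   # hypothetical per-frame face features
mfcc_feats = rng.normal(size=(40, 13))   # hypothetical 13-dim MFCC vectors
fused = np.hstack([face_feats, mfcc_feats])
embedding = laplacian_eigenmap(fused, n_components=2)
print(embedding.shape)  # (40, 2)
```

In a verification setting, the low-dimensional embedding of a test sequence would then be compared against enrolled speakers, e.g. by a nearest-neighbour rule in the fused subspace.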