Audio-visual person authentication using lip-motion from orientation maps

Authors:
Maycel-Isaac Faraj;Josef Bigun
Affiliations:
School of Information Science, Computer and Electrical Engineering (IDE), Halmstad University, Box 823, SE-301 18 Halmstad, Sweden;School of Information Science, Computer and Electrical Engineering (IDE), Halmstad University, Box 823, SE-301 18 Halmstad, Sweden
Venue:
Pattern Recognition Letters
Year:
2007

Abstract

This paper describes a new identity authentication technique based on the synergetic use of lip-motion and speech. Lip-motion is defined as the distribution of apparent velocities in the movement of brightness patterns in an image, and it is estimated by computing the velocity components of the structure tensor through 1D processing in 2D manifolds. Because the velocities are computed without extracting the speaker's lip-contours, the resulting visual features are more robust than motion features extracted from lip-contours. The motion estimation is performed in a rectangular lip-region, which improves computational efficiency. A person authentication implementation based on lip-movements and speech is presented, along with experiments exhibiting a recognition rate of 98%. Besides its value in authentication, the technique lends itself naturally to evaluating the "liveness" of a speaker, since it can be used in text-prompted dialogue. The XM2VTS database was used to quantify performance, as it is currently the largest publicly available database (~300 persons) containing both lip-motion and speech. Comparisons with other techniques are presented.
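
The idea of reading a velocity off a structure tensor in a 2D manifold can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single 2D space-time slice f(t, x) cut from the lip-region, a pattern that translates at constant speed within that slice, and the complex-moment form of the 2D structure tensor. The function names `slice_velocity` and `translating_pattern` and the synthetic sinusoidal test signal are hypothetical and introduced here for illustration only.

```python
import numpy as np


def slice_velocity(f_tx):
    """Estimate one velocity (pixels per frame) from a 2D space-time slice.

    f_tx has shape (T, X): rows are frames, columns are positions along
    one spatial line. Assumption: the brightness pattern in the slice
    translates as g(x - v*t), so its gradient is proportional to (1, -v).
    """
    # Partial derivatives along time (axis 0) and space (axis 1).
    ft, fx = np.gradient(np.asarray(f_tx, dtype=float))
    # Complex second-order moment of the gradient field; its argument is
    # twice the angle of the dominant gradient direction in the slice.
    z = np.sum((fx + 1j * ft) ** 2)
    theta = 0.5 * np.angle(z)
    # Gradient direction (cos(theta), sin(theta)) ~ (1, -v), so v = -tan(theta).
    return -np.tan(theta)


def translating_pattern(v, frames=40, width=64, k=0.3):
    """Synthetic test slice: a sinusoid moving at v pixels per frame."""
    t = np.arange(frames)[:, None]
    x = np.arange(width)[None, :]
    return np.sin(k * (x - v * t))


if __name__ == "__main__":
    print(slice_velocity(translating_pattern(0.5)))
```

Because the estimate sums over every pixel of the slice, no contour has to be located first, which is the property the abstract credits for robustness. A real system would need many such slices across the rectangular lip-region and a way to turn the resulting velocities into feature vectors; those steps are outside this sketch.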