Person Verification by Lip-Motion
CVPRW '06 Proceedings of the 2006 Conference on Computer Vision and Pattern Recognition Workshop
This paper describes a new identity-authentication technique based on the synergetic use of lip motion and speech. Lip motion is defined as the distribution of apparent velocities in the movement of brightness patterns in an image, and is estimated by computing the velocity components of the structure tensor using 1D processing in 2D manifolds. Because the velocities are computed without extracting the speaker's lip contours, the resulting visual features are more robust than motion features extracted from lip contours. The motion estimation is performed in a rectangular lip region, which affords increased computational efficiency. A person-authentication implementation based on lip movements and speech is presented, along with experiments exhibiting a recognition rate of 98%. Besides its value in authentication, the technique lends itself naturally to evaluating the "liveness" of a speaker, since it can be used in text-prompted dialogue. The XM2VTS database was used for performance quantification, as it is currently the largest publicly available database (~300 persons) containing both lip motion and speech. Comparisons with other techniques are presented.
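To illustrate the kind of contour-free motion feature the abstract describes, the sketch below estimates normal optical flow (the velocity component along the brightness gradient) inside a rectangular region from two frames. This is a simplified stand-in under the same brightness-constancy assumption, not the authors' exact structure-tensor 1D/2D-manifold implementation; the frames, region, and `eps` threshold are illustrative choices.

```python
import numpy as np

def normal_flow(prev, curr):
    """Estimate normal optical flow from two consecutive frames.

    Brightness constancy gives Ix*u + Iy*v + It = 0; the recoverable
    velocity component along the gradient direction is
        v_n = -It / |grad I|.
    A small threshold guards flat (gradient-free) pixels, where no
    motion can be observed.  Simplified sketch, not the paper's
    structure-tensor estimator.
    """
    prev = prev.astype(float)
    curr = curr.astype(float)
    Iy, Ix = np.gradient(prev)      # spatial derivatives (rows, cols)
    It = curr - prev                # temporal derivative
    mag = np.hypot(Ix, Iy)          # gradient magnitude
    eps = 1e-6
    return np.where(mag > eps, -It / np.maximum(mag, eps), 0.0)

# Toy example: a vertical brightness edge shifted one pixel rightward,
# standing in for motion inside a rectangular lip region.
f0 = np.zeros((8, 8)); f0[:, :4] = 1.0
f1 = np.zeros((8, 8)); f1[:, :5] = 1.0
v = normal_flow(f0, f1)   # nonzero only near the moving edge
```

Because only the gradient-parallel component is observable at each pixel (the aperture problem), a dense field of such estimates over the lip region, rather than a tracked contour, is what makes the feature robust to contour-extraction errors.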