Speech is an intrinsically bimodal means of communication: the audio signal originates from the dynamics of the articulators, some of which are also visible on the speaker's face. This paper reviews recent work on audiovisual speech, and more specifically the techniques developed to measure the level of correspondence between audio and visual speech. It surveys the most common audio and visual speech front-end processing, the transformations performed on audio, visual, or joint audiovisual feature spaces, and the actual measures of correspondence between audio and visual speech. Finally, the use of a synchrony measure for biometric identity verification based on talking faces is evaluated experimentally on the BANCA database.
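
To make the idea of a correspondence measure concrete, here is a minimal sketch of one very simple option: the Pearson correlation between a per-frame audio feature and a per-frame visual feature. It is an illustration only, not the measure used in the paper. The function name `pearson_synchrony`, the choice of features (audio energy and mouth opening), and the toy values are all assumptions, and the two tracks are assumed to be already resampled to the same frame rate.

```python
import math


def pearson_synchrony(audio_track, visual_track):
    """Pearson correlation between two frame-aligned feature tracks.

    Both arguments are sequences of floats with one value per frame,
    e.g. audio energy and mouth opening (hypothetical feature choices).
    Returns a value in [-1, 1]; higher means stronger linear correspondence.
    """
    if len(audio_track) != len(visual_track):
        raise ValueError("feature tracks must have the same number of frames")
    n = len(audio_track)
    if n < 2:
        raise ValueError("need at least two frames")

    mean_a = sum(audio_track) / n
    mean_v = sum(visual_track) / n

    # Unnormalised covariance and variances (the 1/n factors cancel out).
    cov = sum((a - mean_a) * (v - mean_v)
              for a, v in zip(audio_track, visual_track))
    var_a = sum((a - mean_a) ** 2 for a in audio_track)
    var_v = sum((v - mean_v) ** 2 for v in visual_track)

    if var_a == 0 or var_v == 0:
        # A constant track carries no information about synchrony.
        return 0.0
    return cov / math.sqrt(var_a * var_v)


# Toy tracks (made-up values): the mouth opens as the audio energy rises.
audio_energy = [0.1, 0.4, 0.9, 0.5, 0.2]
mouth_opening = [1.0, 2.5, 5.0, 3.0, 1.5]
score = pearson_synchrony(audio_energy, mouth_opening)
```

In a verification setting, a score of this kind would be compared against a threshold to decide whether the voice and the face plausibly belong to the same live utterance. How the threshold is chosen, and which features and transformations feed the measure, is the subject of the review itself.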