Text driven face-video synthesis using GMM and spatial correlation

  • Authors:
  • Dereje Teferi; Maycel I. Faraj; Josef Bigun

  • Affiliations:
  • School of Information Science, Computer, and Electrical Engineering (IDE), Halmstad University, Halmstad, Sweden (all authors)

  • Venue:
  • SCIA'07: Proceedings of the 15th Scandinavian Conference on Image Analysis
  • Year:
  • 2007

Abstract

Liveness detection is increasingly being incorporated into biometric systems to reduce the risk of spoofing and impersonation. Techniques used include detection of head motion while posing/speaking, iris size under varying illumination, fingerprint sweat, text-prompted speech, and speech-to-lip motion synchronization. In this paper, we propose to build a biometric signal to test the attack resilience of biometric systems by creating a text-driven video synthesis of faces. We synthesize new, realistic-looking video sequences from real image sequences representing utterances of digits. We determine the image sequences for each digit using a GMM-based speech recognizer. Then, depending on the system prompt (a sequence of digits), our method regenerates a video signal to test the attack resilience of a biometric system that asks for random digit utterances to prevent play-back of pre-recorded audio and image data. The discontinuities in the new image sequence, created at the junction of each digit, are removed by a frame-prediction algorithm that uses the well-known block-matching algorithm. Other uses of our results include web-based video communication for electronic commerce and frame interpolation for low-frame-rate video.
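The abstract's first step, selecting the image sequence for each digit with a GMM-based speech recognizer, follows the standard pattern of training one Gaussian mixture per digit class on acoustic feature vectors and classifying an utterance by maximum log-likelihood. The paper does not give implementation details, so the sketch below is a generic illustration of that pattern, not the authors' recognizer; the feature dimensionality, component count, and the `recognize` helper are all assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_digit_models(features_by_digit, n_components=4, seed=0):
    """Fit one GMM per digit class.

    features_by_digit: dict mapping digit label -> (n_frames, n_features)
    array of per-frame acoustic features (e.g. MFCCs). Hypothetical
    interface; the paper does not specify the feature set.
    """
    models = {}
    for digit, feats in features_by_digit.items():
        gmm = GaussianMixture(n_components=n_components, random_state=seed)
        gmm.fit(feats)
        models[digit] = gmm
    return models

def recognize(models, utterance_feats):
    """Classify an utterance as the digit whose GMM assigns it the
    highest mean per-frame log-likelihood (GaussianMixture.score)."""
    return max(models, key=lambda d: models[d].score(utterance_feats))
```

With one model per digit, a prompted sequence of digits can be segmented and matched against the corresponding stored image sequences before concatenation.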
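The smoothing step at digit junctions relies on block matching: each block of the frame after the junction is located in the frame before it by exhaustive search minimizing a distortion measure, and an intermediate frame is synthesized along the estimated motion vectors. The following is a minimal sketch of that generic technique, assuming grayscale frames, sum-of-absolute-differences (SAD) matching, and half-vector interpolation; block size, search range, and both function names are illustrative choices, not the paper's parameters.

```python
import numpy as np

def block_match(prev, curr, block=8, search=4):
    """For each block of `curr`, find the displacement (dy, dx) of the
    best-matching block in `prev` by exhaustive SAD search."""
    h, w = prev.shape
    vectors = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            target = curr[y:y + block, x:x + block].astype(int)
            best_sad, best_v = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue  # candidate block falls outside the frame
                    cand = prev[yy:yy + block, xx:xx + block].astype(int)
                    sad = np.abs(cand - target).sum()
                    if best_sad is None or sad < best_sad:
                        best_sad, best_v = sad, (dy, dx)
            vectors[by, bx] = best_v
    return vectors

def predict_midframe(prev, curr, vectors, block=8):
    """Synthesize an intermediate frame by copying each block from
    `prev` displaced by half its motion vector (motion-compensated
    interpolation; a crude stand-in for the paper's frame prediction)."""
    out = prev.copy()
    h, w = prev.shape
    for by in range(vectors.shape[0]):
        for bx in range(vectors.shape[1]):
            y, x = by * block, bx * block
            dy, dx = vectors[by, bx] // 2
            yy = min(max(y + dy, 0), h - block)
            xx = min(max(x + dx, 0), w - block)
            out[y:y + block, x:x + block] = prev[yy:yy + block, xx:xx + block]
    return out
```

Inserting one or more predicted frames at each digit boundary in this way hides the temporal discontinuity created by concatenating independently recorded digit sequences.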