Probabilistic Models and Informative Subspaces for Audiovisual Correspondence

  • Authors:
  • John W. Fisher III; Trevor Darrell

  • Affiliations:
  • -;-

  • Venue:
  • ECCV '02: Proceedings of the 7th European Conference on Computer Vision, Part III
  • Year:
  • 2002

Abstract

We propose a probabilistic model of single-source multimodal generation and show how algorithms that maximize mutual information can find correspondences between the components of each signal. We show how non-parametric techniques for finding informative subspaces can capture the complex statistical relationships between signals in different modalities. We extend a previous technique for finding informative subspaces to include new priors on the projection weights, which yields more robust results. Applied to human speakers, our model finds the relationship between audio speech and video of facial motion, and partially segments out background events in both channels. We present new results on the problem of audio-visual verification and show how the audio and video of a speaker can be matched even when no prior model of the speaker's voice or appearance is available.
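
To make the idea of an informative subspace concrete, the following is a minimal sketch, not the authors' implementation. It scores a pair of one-dimensional projections of audio and video features by an estimate of their mutual information, minus a penalty on the projection weights. The abstract does not give the estimator or the form of the priors, so several pieces here are assumptions: a simple histogram plug-in estimator stands in for the non-parametric technique, an L2 penalty stands in for the priors on the projection weights, and the names `histogram_mi`, `penalized_objective`, `lam`, and the synthetic features are hypothetical.

```python
import numpy as np


def histogram_mi(a, b, bins=8):
    """Plug-in estimate of mutual information (in nats) between two 1-D samples.

    Assumption: a histogram estimator is used here only for brevity; it is
    a stand-in for the non-parametric estimator the abstract refers to.
    """
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()                 # joint distribution over bins
    p_a = p_ab.sum(axis=1, keepdims=True)      # marginal of a, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)      # marginal of b, shape (1, bins)
    nz = p_ab > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])))


def penalized_objective(w_audio, w_video, audio, video, lam=0.01):
    """MI between the two projected signals minus an L2 penalty on the weights.

    Assumption: the L2 penalty is an illustrative choice of weight prior.
    """
    mi = histogram_mi(audio @ w_audio, video @ w_video)
    return mi - lam * (w_audio @ w_audio + w_video @ w_video)


# Synthetic data: one hidden source drives a single audio feature and a
# single video feature; every other feature is independent noise.
rng = np.random.default_rng(0)
n = 2000
source = rng.standard_normal(n)
audio = rng.standard_normal((n, 4))
video = rng.standard_normal((n, 6))
audio[:, 0] = source + 0.1 * rng.standard_normal(n)
video[:, 2] = source + 0.1 * rng.standard_normal(n)

w_audio = np.eye(4)[0]       # selects the source-driven audio feature
w_video_good = np.eye(6)[2]  # selects the source-driven video feature
w_video_bad = np.eye(6)[0]   # selects a pure-noise video feature

good = penalized_objective(w_audio, w_video_good, audio, video)
bad = penalized_objective(w_audio, w_video_bad, audio, video)
print(f"informative projection: {good:.3f}, uninformative projection: {bad:.3f}")
```

In this toy setting the projection pair that picks out the shared source scores higher than the pair that includes a noise-only feature. A learning procedure would search over the weights to maximize such an objective rather than compare two fixed choices.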