Probabilistic Models and Informative Subspaces for Audiovisual Correspondence

Authors:
John W. Fisher III; Trevor Darrell
Venue:
ECCV '02 Proceedings of the 7th European Conference on Computer Vision-Part III
Year:
2002

Citing 3
Cited 3

Elements of information theory

An information-theoretic unsupervised learning algorithm for neural networks

An Information-Theoretic Approach to Neural Computing

Audiovisual Arrays for Untethered Spoken Interfaces

ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
EM detection of common origin of multi-modal cues

Proceedings of the 8th international conference on Multimodal interfaces
On-line multi-modal speaker diarization

Proceedings of the 9th international conference on Multimodal interfaces

Abstract

We propose a probabilistic model of single-source multimodal generation and show how algorithms for maximizing mutual information can find the correspondences between components of each signal. We show how non-parametric techniques for finding informative subspaces can capture the complex statistical relationship between signals in different modalities. We extend a previous technique for finding informative subspaces to include new priors on the projection weights, yielding more robust results. Applied to human speakers, our model can find the relationship between audio speech and video of facial motion, and partially segment out background events in both channels. We present new results on the problem of audio-visual verification, and show how the audio and video of a speaker can be matched even when no prior model of the speaker's voice or appearance is available.
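
The idea summarized in the abstract, choosing projections of the audio and video features that maximize mutual information and using that value to judge whether the two streams share a source, can be illustrated with a minimal sketch. This is not the authors' method. It assumes jointly Gaussian features, in which case the information-maximizing linear projections are the top canonical-correlation pair and the mutual information is -0.5 * ln(1 - rho^2); the paper itself relies on non-parametric estimates and priors on the projection weights, neither of which is modeled here. The function names, the synthetic "audio" and "video" features, and the mixing weights are all hypothetical.

```python
import numpy as np


def top_canonical_pair(X, Y, reg=1e-6):
    """Return (wx, wy, rho): the linear projections of X and Y with maximal
    correlation rho. Under a Gaussian assumption (ours, not the paper's)
    these projections also maximize mutual information."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariances; `reg` is only a small ridge for numerical stability.
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    # Whiten both sides with Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # M = Lx^{-1} Cxy Ly^{-T}; its singular values are canonical correlations.
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M)
    wx = np.linalg.solve(Lx.T, U[:, 0])
    wy = np.linalg.solve(Ly.T, Vt[0])
    rho = float(min(s[0], 1.0 - 1e-12))  # clip so the log below stays finite
    return wx, wy, rho


def gaussian_mi(rho):
    """Mutual information (in nats) of two jointly Gaussian scalars."""
    return -0.5 * np.log(1.0 - rho ** 2)


# Synthetic stand-ins for per-frame features driven by one shared source.
rng = np.random.default_rng(0)
n = 2000
source = rng.standard_normal(n)
audio_mix = np.array([1.0, -0.5, 0.3, 0.8])              # hypothetical weights
video_mix = np.array([0.7, 0.2, -1.0, 0.4, 0.1, -0.6])   # hypothetical weights
audio = np.outer(source, audio_mix) + 0.5 * rng.standard_normal((n, 4))
video = np.outer(source, video_mix) + 0.5 * rng.standard_normal((n, 6))

# A mismatched pairing: the same video frames in shuffled temporal order.
mismatched_video = video[rng.permutation(n)]

mi_match = gaussian_mi(top_canonical_pair(audio, video)[2])
mi_mismatch = gaussian_mi(top_canonical_pair(audio, mismatched_video)[2])
print(f"matched MI: {mi_match:.3f} nats, mismatched MI: {mi_mismatch:.3f} nats")
```

In this toy setting the matched pair yields a much larger information estimate than the shuffled pair, which mirrors the verification use described in the abstract: the decision needs no prior model of the speaker, only a comparison of the estimated dependence between the two streams.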