Multiple features in temporal models for the representation of visual contents in video

Authors:
Juan M. Sánchez; Xavier Binefa; John R. Kender
Affiliations:
Dept. d'Informàtica, U. Autònoma de Barcelona, Bellaterra, Barcelona, Spain; Dept. d'Informàtica, U. Autònoma de Barcelona, Bellaterra, Barcelona, Spain; Dept. of Computer Science, Columbia University, New York
Venue:
CIVR'03 Proceedings of the 2nd international conference on Image and video retrieval
Year:
2003

Abstract

This paper analyzes different ways of coupling the information from multiple visual features in the representation of visual contents using temporal models based on Markov chains. We assume that the optimal combination is given by the Cartesian product of all feature state spaces. Simpler model structures are obtained by assuming independencies between random variables in the probabilistic structure. The relative entropy provides a measure of the information loss of a simplified structure with respect to a more complex one. The loss of information is then compared to the loss of accuracy in the representation of visual contents in video sequences, which is measured in terms of shot retrieval performance. We reach three main conclusions: (1) the full-coupled model structure is an accurate approximation to the Cartesian product structure, (2) the largest loss of information is found when direct temporal dependencies are removed, and (3) there is a direct relationship between loss of information and loss of representation accuracy.
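
The idea of using relative entropy to measure the information lost by a simplified structure can be illustrated with a minimal sketch. It assumes two discrete feature sequences, first-order Markov chains estimated from transition counts, and a single simplified structure in which each feature follows its own independent chain. The function names `transition_probs` and `information_loss` are hypothetical, and the abstract does not specify the estimator used in the paper, so this is an illustration of the concept rather than the authors' procedure.

```python
from collections import Counter
from math import log2


def transition_probs(seq):
    """Estimate first-order transition probabilities P(next | current)
    from a sequence of hashable states."""
    pair_counts = Counter(zip(seq[:-1], seq[1:]))
    from_counts = Counter(seq[:-1])
    return {(s, t): c / from_counts[s] for (s, t), c in pair_counts.items()}


def information_loss(feature_a, feature_b):
    """Relative entropy, in bits per transition, of the chain over the
    Cartesian product of both state spaces with respect to a simplified
    model made of two independent per-feature chains (an assumed
    simplification chosen for illustration)."""
    joint = list(zip(feature_a, feature_b))  # Cartesian product states
    pair_counts = Counter(zip(joint[:-1], joint[1:]))
    total = sum(pair_counts.values())

    p_joint = transition_probs(joint)
    p_a = transition_probs(feature_a)
    p_b = transition_probs(feature_b)

    loss = 0.0
    for (s, t), count in pair_counts.items():
        # Simplified model: the two features evolve independently.
        q = p_a[(s[0], t[0])] * p_b[(s[1], t[1])]
        # count / total is the empirical weight of this transition.
        loss += (count / total) * log2(p_joint[(s, t)] / q)
    return loss


if __name__ == "__main__":
    a = [0, 0, 1, 0, 1, 1, 0, 0, 1, 1]
    print(information_loss(a, a))  # features fully dependent: positive loss
```

When the two features carry no information about each other's transitions, the factorized model matches the product chain and the loss is zero. When they are strongly dependent, as in the usage line where both sequences are identical, the loss is positive.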