A Bayesian network modeling approach for cross media analysis

Authors:
Christina Lakka; Spiros Nikolopoulos; Christos Varytimidis; Ioannis Kompatsiaris
Affiliations:
Informatics and Telematics Institute, CERTH, 6th km Charilaou-Thermi Road, Thermi-Thessaloniki, Greece; Informatics and Telematics Institute, CERTH, 6th km Charilaou-Thermi Road, Thermi-Thessaloniki, Greece and School of Electronic Engineering and Computer Science, Queen Mary University of London, E ...; School of Electrical and Computer Engineering, National Technical University of Athens, Greece; Informatics and Telematics Institute, CERTH, 6th km Charilaou-Thermi Road, Thermi-Thessaloniki, Greece
Venue:
Image Communication
Year:
2011

Abstract

Existing methods for the semantic analysis of multimedia, although effective in single-medium scenarios, are inherently flawed when knowledge is spread over different media types. In this work we implement a cross media analysis scheme that takes advantage of both visual and textual information to detect high-level concepts. The novel aspect of this scheme is the definition and use of a conceptual space in which information originating from heterogeneous media types can be meaningfully combined to facilitate analysis decisions. More specifically, our contribution is a modeling approach for Bayesian Networks that defines this conceptual space and allows evidence originating from the domain knowledge, the application context and different content modalities to support or disprove a given hypothesis. Using this scheme we performed experiments on a set of 162 compound documents taken from the car manufacturing domain and 118,581 video shots taken from the TRECVID 2010 competition. The results show that the proposed modeling approach exploits the complementary effect of evidence extracted across different media and delivers performance improvements over the single-medium cases. Moreover, by comparing the proposed approach with one based on Support Vector Machines (SVM), we have verified that in a cross media setting generative models are better suited than discriminative ones, mainly due to their ability to smoothly incorporate explicit knowledge and to learn from a few examples.
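
To illustrate the idea of evidence from different media supporting or disproving a hypothesis, the following is a minimal sketch and not the network structure proposed in the paper. It assumes a single binary concept node with two evidence nodes (one visual, one textual) that are conditionally independent given the concept. The function name `posterior` and all probability values are hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihoods_true, likelihoods_false):
    """Return P(concept = true | evidence).

    Assumes each piece of evidence is conditionally independent of the
    others given the concept (a simplifying assumption for this sketch).
    """
    joint_true = prior
    joint_false = 1.0 - prior
    for p_true, p_false in zip(likelihoods_true, likelihoods_false):
        joint_true *= p_true      # P(evidence | concept = true)
        joint_false *= p_false    # P(evidence | concept = false)
    return joint_true / (joint_true + joint_false)


# Hypothetical numbers, not taken from the paper.
PRIOR = 0.2                 # P(concept = true) before seeing any evidence
VISUAL = (0.7, 0.3)         # P(visual cue | true), P(visual cue | false)
TEXTUAL = (0.6, 0.1)        # P(textual cue | true), P(textual cue | false)

visual_only = posterior(PRIOR, [VISUAL[0]], [VISUAL[1]])
textual_only = posterior(PRIOR, [TEXTUAL[0]], [TEXTUAL[1]])
cross_media = posterior(PRIOR, [VISUAL[0], TEXTUAL[0]], [VISUAL[1], TEXTUAL[1]])

print(f"visual only:  {visual_only:.3f}")
print(f"textual only: {textual_only:.3f}")
print(f"cross media:  {cross_media:.3f}")
```

With these illustrative values, observing both cues yields a higher posterior for the concept than either cue alone, which is the kind of complementary effect the abstract refers to. A full Bayesian network would additionally encode domain knowledge and application context as further nodes and dependencies.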