Existing methods for the semantic analysis of multimedia, although effective in single-medium scenarios, are inherently flawed in cases where knowledge is spread over different media types. In this work we implement a cross-media analysis scheme that takes advantage of both visual and textual information for detecting high-level concepts. The novel aspect of this scheme is the definition and use of a conceptual space in which information originating from heterogeneous media types can be meaningfully combined to facilitate analysis decisions. More specifically, our contribution is a modeling approach for Bayesian Networks that defines this conceptual space and allows evidence originating from the domain knowledge, the application context and different content modalities to support or disprove a given hypothesis. Using this scheme we have performed experiments on a set of 162 compound documents taken from the domain of the car manufacturing industry and on 118,581 video shots taken from the TRECVID2010 competition. The results show that the proposed modeling approach exploits the complementary effect of evidence extracted across different media and delivers performance improvements over the single-medium cases. Moreover, by comparing the performance of the proposed approach with an approach using Support Vector Machines (SVM), we have verified that in a cross-media setting generative models are better suited than discriminative ones, mainly due to their ability to smoothly incorporate explicit knowledge and to learn from a few examples.
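To make the idea of evidence from different modalities supporting or disproving a hypothesis more concrete, the following is a minimal sketch and not the network structure used in this work. It assumes a single binary concept node with two evidence nodes (one visual, one textual) that are conditionally independent given the concept. The function name `fuse_evidence` and all probability values are hypothetical and chosen only for illustration.

```python
def fuse_evidence(prior, likelihoods):
    """Posterior P(concept | evidence) for a binary concept.

    Assumption (illustrative only): evidence items are conditionally
    independent given the concept, i.e. the simplest possible Bayesian
    network with one parent and several evidence children.

    likelihoods: list of (P(e | concept), P(e | not concept)) pairs.
    """
    p_true = prior
    p_false = 1.0 - prior
    for l_true, l_false in likelihoods:
        p_true *= l_true
        p_false *= l_false
    # Normalise over the two states of the concept node.
    return p_true / (p_true + p_false)


# Hypothetical numbers, not taken from the experiments.
prior = 0.2            # prior belief that the concept is present
visual = (0.7, 0.3)    # visual detector fires: P(e|c), P(e|not c)
textual = (0.8, 0.1)   # textual cue observed: P(e|c), P(e|not c)

visual_only = fuse_evidence(prior, [visual])
textual_only = fuse_evidence(prior, [textual])
cross_media = fuse_evidence(prior, [visual, textual])

print(f"visual only : {visual_only:.3f}")
print(f"textual only: {textual_only:.3f}")
print(f"cross media : {cross_media:.3f}")
```

With these made-up values, the posterior obtained from both modalities is higher than either single-medium posterior, which is the kind of complementary effect the abstract refers to. Evidence with a likelihood ratio below one would lower the posterior instead.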