An integrated statistical model for multimedia evidence combination

Authors:
Sheng Gao; Joo-Hwee Lim; Qibin Sun
Affiliations:
Institute for Infocomm Research, Singapore, Singapore; Institute for Infocomm Research, Singapore, Singapore; Institute for Infocomm Research, Singapore, Singapore
Venue:
Proceedings of the 15th international conference on Multimedia
Year:
2007

Abstract

Given the rich content-based features of multimedia (e.g., visual, text, or audio) and the variety of approaches to building automatic detectors (e.g., SVM, AdaBoost, HMM, or GMM), can we find an efficient way to combine this evidence? In this paper, we address the issue by proposing an Integrated Statistical Model (ISM) that combines diverse evidence extracted from the domain knowledge of detectors, the intrinsic structure of the modality distribution, and inter-concept associations. The ISM provides a unified framework for evidence fusion with the following advantages: 1) the intrinsic modes in the modality distribution are discovered and modeled by a generative model; 2) each mode is a partial description of the structure of the modality, and the mode configuration, i.e., a set of modes, is a new representation of the document content; 3) mode discrimination is learned automatically; 4) prior knowledge such as detector correlations and inter-concept relations can be explicitly described and integrated. More importantly, an efficient pseudo-EM algorithm is developed for training the statistical model. The learning algorithm reduces the computational cost incurred by the normalization factor and the latent variables in the graphical model. We evaluate the efficiency and capacity of the system on multimedia semantic concept detection using the TRECVID 2005 development dataset. Our experimental results demonstrate that ISM fusion outperforms the SVM-based discriminative fusion method.
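
To make the notions of "mode" and "mode configuration" more concrete, the following is a minimal illustrative sketch, not the ISM itself. It assumes each detector's scores are modeled by a one-dimensional Gaussian mixture fitted with ordinary EM (the abstract does not specify the generative model or the details of the pseudo-EM training), and it represents a document by the tuple of most likely mode indices across detectors. The function names `fit_modes`, `mode_of`, and `mode_configuration`, and the synthetic scores, are hypothetical.

```python
import math
import random


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_modes(scores, k=2, iters=50):
    """Fit a k-component 1-D Gaussian mixture to detector scores with plain EM.

    Hypothetical helper for illustration; this is standard EM, not the
    pseudo-EM algorithm mentioned in the abstract.
    Returns (weights, means, variances).
    """
    n = len(scores)
    ordered = sorted(scores)
    # Initialise means at evenly spaced quantiles, with a shared variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(scores) / n
    spread = max(sum((x - centre) ** 2 for x in scores) / n, 1e-6)
    variances = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior probability of each mode for each score.
        resp = []
        for x in scores:
            dens = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each mode.
        for j in range(k):
            mass = sum(r[j] for r in resp)
            if mass < 1e-12:
                continue  # leave an empty mode unchanged
            weights[j] = mass / n
            means[j] = sum(r[j] * x for r, x in zip(resp, scores)) / mass
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, scores)) / mass
            variances[j] = max(var, 1e-6)
    return weights, means, variances


def mode_of(score, model):
    """Index of the mode with the highest weighted density at this score."""
    weights, means, variances = model
    dens = [w * gauss_pdf(score, m, v) for w, m, v in zip(weights, means, variances)]
    return max(range(len(dens)), key=dens.__getitem__)


def mode_configuration(doc_scores, models):
    """Represent a document by one mode index per detector."""
    return tuple(mode_of(s, m) for s, m in zip(doc_scores, models))


if __name__ == "__main__":
    random.seed(0)
    # Synthetic scores from two hypothetical detectors, each with two clusters.
    det_a = [random.gauss(0.2, 0.05) for _ in range(100)] + [random.gauss(0.8, 0.05) for _ in range(100)]
    det_b = [random.gauss(0.3, 0.05) for _ in range(100)] + [random.gauss(0.7, 0.05) for _ in range(100)]
    models = [fit_modes(det_a), fit_modes(det_b)]
    print(mode_configuration([0.25, 0.75], models))
```

In a fusion setting, such a discrete configuration could then serve as the input to a downstream discriminative model; how the ISM actually weights modes and integrates prior knowledge is described in the paper, not in this sketch.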