We propose a novel multimodal approach that automatically predicts the visual concepts of images through an effective fusion of visual and textual features. It relies on a Selective Weighted Late Fusion (SWLF) scheme that, by optimizing the overall Mean interpolated Average Precision (MiAP), learns to automatically select and weight the best experts for each visual concept to be recognized. Experiments were conducted on the MIR Flickr image collection within the ImageCLEF 2011 Photo Annotation challenge. The results demonstrate the effectiveness of SWLF: it achieved a MiAP of 43.69% for the detection of the 99 visual concepts, ranking 2nd out of the 79 submitted runs, while our new variant of SWLF reaches a MiAP of 43.93%.
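
To make the idea of selecting and weighting experts per concept concrete, the following is a minimal sketch for a single concept. The abstract does not specify the exact selection or weighting rule, so the sketch rests on stated assumptions: experts are ranked by their interpolated average precision (iAP) on a validation set, the top-N experts are fused with weights proportional to their iAP, and N is chosen to maximize the iAP of the fused scores. The function names (interpolated_ap, fuse, select_and_weight) and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def interpolated_ap(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Interpolated average precision of one ranked list (one common definition)."""
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    # Rank images by decreasing score (ties keep their original order).
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    precisions = []
    hits = 0
    for rank, i in enumerate(order, start=1):
        hits += labels[i]
        precisions.append(hits / rank)
    # Interpolation: precision at rank r becomes the max precision at any rank >= r.
    for r in range(len(precisions) - 2, -1, -1):
        precisions[r] = max(precisions[r], precisions[r + 1])
    # Average the interpolated precision over the ranks of the relevant images.
    return sum(p for p, i in zip(precisions, order) if labels[i]) / total_pos


def fuse(expert_scores: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Late fusion: weighted sum of the experts' scores, image by image."""
    n_images = len(expert_scores[0])
    return [sum(w * s[j] for w, s in zip(weights, expert_scores)) for j in range(n_images)]


def select_and_weight(
    val_scores: Sequence[Sequence[float]], val_labels: Sequence[int]
) -> Tuple[List[int], List[float]]:
    """Assumed rule: keep the top-N experts (by validation iAP) that maximize fused iAP."""
    aps = [interpolated_ap(s, val_labels) for s in val_scores]
    ranked = sorted(range(len(val_scores)), key=lambda k: -aps[k])
    best_ap, best_experts, best_weights = -1.0, [], []
    for n in range(1, len(ranked) + 1):
        chosen = ranked[:n]
        total = sum(aps[k] for k in chosen)
        # Weights proportional to iAP; fall back to uniform if all iAPs are zero.
        weights = [aps[k] / total for k in chosen] if total > 0 else [1.0 / n] * n
        fused_ap = interpolated_ap(fuse([val_scores[k] for k in chosen], weights), val_labels)
        if fused_ap > best_ap:  # strict: prefer the smaller expert set on ties
            best_ap, best_experts, best_weights = fused_ap, chosen, weights
    return best_experts, best_weights


# Toy validation data for one concept: 4 images, 3 hypothetical experts.
labels = [1, 0, 1, 0]
experts = [
    [0.9, 0.1, 0.8, 0.2],  # ranks both positives first
    [0.1, 0.9, 0.2, 0.8],  # ranks both negatives first
    [0.6, 0.7, 0.9, 0.1],  # in between
]
selected, weights = select_and_weight(experts, labels)
print(selected, weights)  # prints: [0] [1.0]
```

In this toy case the first expert already ranks the validation images perfectly, so adding further experts cannot improve the fused iAP and only that expert is kept. Under the same assumptions, MiAP would be the mean of the per-concept iAP values obtained after running this selection independently for each concept.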