Marginal Alphabets (MEDA) were proposed as an alternative to Bag of Words (BoW) for image representation. They aggregate sets of locally extracted descriptors (LEDs) using visual alphabets based on the marginal approximation of the LED components. Compared to the exponential complexity of BoW codebooks, the MEDA model is very efficient because each dimension of the LED is quantized independently. However, MEDA does not consider the relations between the LED components, losing precious information for image representation. In this paper, we design Multi-MEDA, a shift-invariant kernel for MEDA signatures that reintroduces, at the kernel level, the connections between LED components that were broken by the independent quantization. With our approach, we can derive in polynomial time a multivariate model from the marginal approximations stored in the MEDA vector, without explicitly computing any multidimensional codebook. Results show that the MEDA signature increases its discriminative power when analyzed through the Multi-MEDA kernel. Moreover, we show that the model generated by Multi-MEDA-based learning brings complementary information compared to traditional kernels over MEDA and BoW signatures: our experiments on the TRECVID database show that combining these approaches yields a substantial improvement over BoW-only classification.
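To make the contrast with BoW concrete, the following is a minimal sketch of the two ideas described above: a MEDA-style signature that quantizes each descriptor dimension independently against its own 1-D alphabet, and a kernel that combines the per-dimension comparisons multiplicatively to recouple the dimensions at the kernel level. The function names, the bin-edge alphabets, and the use of histogram intersection as the per-dimension kernel are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def meda_signature(descriptors, bin_edges):
    """MEDA-style signature (illustrative sketch): each descriptor
    dimension is quantized independently against its own 1-D alphabet
    (`bin_edges[d]`), and the normalized per-bin counts of all
    dimensions are concatenated. Cost is linear in the number of
    dimensions, versus a codebook over the joint descriptor space."""
    n, dims = descriptors.shape
    parts = []
    for d in range(dims):
        idx = np.digitize(descriptors[:, d], bin_edges[d])
        counts = np.bincount(idx, minlength=len(bin_edges[d]) + 1)
        parts.append(counts / max(n, 1))
    return np.concatenate(parts)

def multi_meda_kernel(sig_a, sig_b, dims, bins_per_dim):
    """Illustrative product kernel over MEDA signatures: per-dimension
    histogram intersections are multiplied, so the kernel value couples
    all dimensions without ever building a multidimensional codebook.
    This stands in for, but is not, the paper's Multi-MEDA kernel."""
    k = 1.0
    for d in range(dims):
        a = sig_a[d * bins_per_dim:(d + 1) * bins_per_dim]
        b = sig_b[d * bins_per_dim:(d + 1) * bins_per_dim]
        k *= np.minimum(a, b).sum()  # 1-D histogram intersection
    return k
```

A usage sketch: with 4-dimensional descriptors and 3 bin edges per dimension, the signature has 4 × 4 = 16 entries, and the kernel of a signature with itself is 1, since each normalized per-dimension histogram intersects fully with itself.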