Weighting informativeness of bag-of-visual-words by kernel optimization for video concept detection

Authors:
Feng Wang;Bernard Merialdo
Affiliations:
East China Normal University, Shanghai, China;Institute Eurecom, Sophia Antipolis, France
Venue:
Proceedings of the international workshop on Very-large-scale multimedia corpus, mining and retrieval
Year:
2010

Abstract

The Bag-of-Visual-Words (BoW) feature has proven effective and is widely used in video concept detection because it captures local image information and is therefore discriminative. Current approaches treat all words in the visual vocabulary equally, regardless of which concept is being detected. This fails to highlight concept-specific visual information and thus limits the discriminative ability of the BoW feature. In this paper, we propose an approach to boost the performance of BoW-based video concept detection by assigning different weights to the visual words according to their informativeness for detecting each concept. The kernel alignment score (KAS) is used to measure the discriminative ability of SVM kernels, and the weighting of visual words is formulated as a kernel optimization problem. We show that SVMs based on visual words weighted with our approach outperform the uniform weighting and TF-IDF weighting schemes, and that the MAP over the 20 concepts from the TRECVID 2009 high-level feature extraction task is significantly improved.
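
To make the idea of weighting visual words by kernel optimization concrete, the following is a minimal sketch, not the method of the paper. The abstract does not state the kernel form or the optimizer, so the weighted linear kernel, the projected gradient ascent with step halving, the toy data, and all function names (`weighted_kernel`, `alignment`, `alignment_gradient`, `learn_word_weights`) are assumptions made for illustration. The score used here is the standard kernel-target alignment, the normalized Frobenius inner product between the kernel matrix and the label matrix yy^T.

```python
import numpy as np


def weighted_kernel(X, w):
    # Assumed weighted linear kernel: K[i, j] = sum_k w[k] * X[i, k] * X[j, k].
    return (X * w) @ X.T


def alignment(K, y):
    # Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F), labels in {-1, +1}.
    Y = np.outer(y, y)
    return float(np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y)))


def alignment_gradient(X, y, w):
    # Analytic gradient of the alignment with respect to each word weight.
    K = weighted_kernel(X, w)
    k_norm = np.linalg.norm(K)
    inner = float(y @ K @ y)                       # <K, yy^T>_F
    d_inner = (X.T @ y) ** 2                       # d<K, yy^T>_F / dw_k
    d_norm = np.sum(X * (K @ X), axis=0) / k_norm  # d||K||_F / dw_k
    # ||yy^T||_F equals len(y) for labels in {-1, +1}.
    return (d_inner / k_norm - inner * d_norm / k_norm**2) / len(y)


def learn_word_weights(X, y, steps=50, lr=1.0):
    # Projected gradient ascent (illustrative choice): start from uniform
    # weights, keep weights non-negative, accept a step only if it improves
    # the alignment, and halve the step size otherwise.
    w = np.ones(X.shape[1])
    best = alignment(weighted_kernel(X, w), y)
    for _ in range(steps):
        candidate = np.clip(w + lr * alignment_gradient(X, y, w), 0.0, None)
        if candidate.any():
            score = alignment(weighted_kernel(X, candidate), y)
            if score > best:
                w, best = candidate, score
                continue
        lr *= 0.5
    return w, best


# Toy data (hypothetical): 40 "shots", 5 visual words, word 0 occurs only in positives.
rng = np.random.default_rng(0)
y = np.repeat([1.0, -1.0], 20)
X = rng.random((40, 5))
X[:, 0] = y > 0

uniform_score = alignment(weighted_kernel(X, np.ones(5)), y)
w, learned_score = learn_word_weights(X, y)
print("uniform alignment:", uniform_score)
print("learned alignment:", learned_score)
print("learned weights:", w)
```

In a full pipeline, the kernel built from the learned weights could be passed to an SVM that accepts a precomputed kernel matrix, with one weight vector learned per concept. That step is omitted here because the abstract gives no details about it.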