Efficient and Effective Visual Codebook Generation Using Additive Kernels

Authors:
Jianxin Wu; Wei-Chian Tan; James M. Rehg
Venue:
The Journal of Machine Learning Research
Year:
2011

Abstract

Common visual codebook generation methods used in a bag of visual words model, such as k-means or the Gaussian Mixture Model, use the Euclidean distance to cluster features into visual code words. However, most popular visual descriptors are histograms of image measurements. It has been shown that, with histogram features, the Histogram Intersection Kernel (HIK) is more effective than the Euclidean distance in supervised learning tasks. In this paper, we demonstrate that HIK can also be used in an unsupervised manner to significantly improve the generation of visual codebooks. We propose a histogram kernel k-means algorithm that is easy to implement and runs almost as fast as standard k-means. The HIK codebooks achieve consistently higher recognition accuracy than k-means codebooks, by 2-4%, on several benchmark object and scene recognition data sets. The algorithm is also generalized to arbitrary additive kernels. It is thousands of times faster than a naive implementation of the kernel k-means algorithm. In addition, we propose a one-class SVM formulation to create more effective visual code words. Finally, we show that the standard k-median clustering method can be used for visual codebook generation and can act as a compromise between the HIK / additive kernel approach and the k-means approach.
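To make the terms in the abstract concrete, here is a minimal sketch of the naive kernel k-means baseline, using the histogram intersection kernel K(x, y) = sum over bins d of min(x_d, y_d). It is not the authors' fast algorithm, which the abstract does not describe. The function names (`hik`, `kernel_matrix`, `kernel_kmeans`) and the choice to pass an explicit initial assignment are assumptions made for illustration. Each point is assigned to the cluster whose feature-space mean is nearest, using the standard expansion ||phi(x) - mu_c||^2 = K(x, x) - (2/m) sum_j K(x, x_j) + (1/m^2) sum_{j,l} K(x_j, x_l), where the sums run over the m members of cluster c.

```python
def hik(x, y):
    """Histogram intersection kernel: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(x, y))


def kernel_matrix(X, kernel=hik):
    """Full n x n kernel matrix; this quadratic cost is what makes the naive method slow."""
    return [[kernel(xi, xj) for xj in X] for xi in X]


def kernel_kmeans(K, labels, k, n_iter=20):
    """Naive kernel k-means on a precomputed kernel matrix K.

    `labels` is the initial cluster assignment (a hypothetical interface,
    chosen here so that the example is deterministic).
    """
    n = len(K)
    labels = list(labels)
    for _ in range(n_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # (1/m^2) * sum of K over all member pairs: constant per cluster.
        self_sim = [
            sum(K[j][l] for j in members for l in members) / len(members) ** 2
            if members else 0.0
            for members in clusters
        ]
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c, members in enumerate(clusters):
                if not members:
                    continue  # an empty cluster cannot attract points
                m = len(members)
                # Squared feature-space distance from point i to the mean of cluster c.
                d = K[i][i] - 2.0 * sum(K[i][j] for j in members) / m + self_sim[c]
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Toy usage: two groups of histograms with disjoint support.
X = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
K = kernel_matrix(X)
final_labels = kernel_kmeans(K, labels=[0, 0, 1, 0, 1, 1], k=2)
```

In this sketch every iteration touches the full kernel matrix, so the cost grows quadratically with the number of features being clustered. That is the overhead the abstract refers to when it compares the proposed method with a naive implementation.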