On vocabulary size in bag-of-visual-words representation

Authors:
Jian Hou; Jianxin Kang; Naiming Qi
Affiliations:
Jian Hou: School of Astronautics, Harbin Institute of Technology, Harbin, China
Jianxin Kang: School of Astronautics, Harbin Institute of Technology, Harbin, China and School of Engineering, Northeast Agriculture University, Harbin, China
Naiming Qi: School of Astronautics, Harbin Institute of Technology, Harbin, China
Venue:
PCM'10 Proceedings of the 11th Pacific Rim conference on Advances in multimedia information processing: Part I
Year:
2010

Abstract

Bag-of-visual-words is a popular image representation that offers high matching accuracy and efficiency. Although vocabulary size affects matching accuracy, existing research usually selects it empirically. Research on representative local descriptors shows that, with similarity-based clustering, the extent of intra-cluster similarity among descriptors plays the same role in straightforward matching as vocabulary size does in visual-words matching. Based on this observation, we propose using similarity-based clustering to determine the optimal vocabulary size for a given dataset in visual-words matching. Preliminary experiments on three datasets produce encouraging results and demonstrate the potential of the proposed approach.
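
The abstract does not say which clustering algorithm is used, so the following is only a minimal sketch of the general idea: when clusters are formed under a similarity threshold, the number of clusters (and hence a candidate vocabulary size) is a result of the clustering rather than a parameter chosen up front. The greedy "leader" clustering, the cosine similarity measure, the function names (`cosine_similarity`, `leader_cluster`, `vocabulary_size`), the parameter `min_similarity`, and the toy 2-D descriptors are all assumptions made for illustration, not the authors' method.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def leader_cluster(descriptors, min_similarity):
    """Greedy leader clustering (an illustrative stand-in, not the paper's algorithm).

    Each descriptor joins the most similar existing leader if that similarity
    is at least min_similarity; otherwise it becomes a new leader.
    Returns (leaders, labels).
    """
    leaders = []
    labels = []
    for d in descriptors:
        best_idx, best_sim = -1, -1.0
        for i, leader in enumerate(leaders):
            s = cosine_similarity(d, leader)
            if s > best_sim:
                best_idx, best_sim = i, s
        if best_idx >= 0 and best_sim >= min_similarity:
            labels.append(best_idx)
        else:
            leaders.append(d)
            labels.append(len(leaders) - 1)
    return leaders, labels


def vocabulary_size(descriptors, min_similarity):
    """Number of clusters produced under the given similarity threshold."""
    leaders, _ = leader_cluster(descriptors, min_similarity)
    return len(leaders)


# Hypothetical toy descriptors; real local descriptors (e.g. SIFT) have far more dimensions.
toy = [
    (1.0, 0.0), (0.9, 0.1),
    (0.0, 1.0), (0.1, 0.9),
    (1.0, 1.0),
]

# A stricter threshold yields more clusters, i.e. a larger vocabulary.
for threshold in (0.7, 0.95):
    print(threshold, vocabulary_size(toy, threshold))
```

In this sketch the similarity threshold is the quantity one would tune, and the vocabulary size follows from it for a given dataset. The result of a greedy leader pass depends on the order of the descriptors, so a practical implementation would likely use a more stable clustering procedure.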