Fast and efficient visual codebook construction for multi-label annotation using predictive clustering trees

Authors:
Ivica Dimitrovski;Dragi Kocev;Suzana Loskovska;Sašo Džeroski
Venue:
Pattern Recognition Letters
Year:
2014

Abstract

The bag-of-visual-words approach to representing images is very popular in the image annotation community. A crucial part of this approach is the construction of the visual codebook. The visual codebook is typically constructed by using a clustering algorithm (most often k-means) to cluster hundreds of thousands of local descriptors (key-points) into several thousand visual words. Given the large numbers of examples and clusters, the clustering algorithm is a bottleneck in the construction of bag-of-visual-words representations of images. To alleviate this bottleneck, we propose to construct the visual codebook by using predictive clustering trees (PCTs) for multi-label classification (MLC). Such a PCT is able to assign multiple labels to a given image, i.e., to completely annotate a given image. Given that PCTs (and decision trees in general) are unstable predictive models, we propose to use a random forest of PCTs for MLC to produce the overall visual codebook. Our hypothesis is that the PCTs for MLC can exploit the connections between the labels and thus produce a visual codebook with better discriminative power. We evaluate our approach on three relevant image databases. We compare the efficiency and the discriminative power of the proposed approach to the literature standard, k-means clustering. The results reveal that our approach is much more efficient in terms of computational time and produces a visual codebook with better discriminative power than k-means clustering. The scalability of the proposed approach allows us to construct visual codebooks using more local descriptors than usual, thus further increasing their discriminative power.
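
To make the two kinds of codebook concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each centroid of a flat (k-means style) codebook, or each leaf of a tree codebook, acts as one visual word, and that an image is represented by the histogram of words hit by its local descriptors. The centroids, the hand-written tree, the toy 2-D descriptors, and all function names are hypothetical. A real PCT would learn its splits from the image labels, and a forest would combine the leaves of several trees.

```python
from collections import Counter
from math import dist


def nearest_word(descriptor, centroids):
    """Flat codebook: return the index of the closest centroid.

    Every descriptor is compared against every centroid, so the cost
    grows with (number of descriptors) x (number of visual words).
    """
    return min(range(len(centroids)), key=lambda k: dist(descriptor, centroids[k]))


def tree_word(descriptor, node):
    """Tree codebook: follow threshold tests down to a leaf.

    An internal node is (feature_index, threshold, left, right);
    a leaf is an int word id. The cost per descriptor is the tree depth.
    """
    while not isinstance(node, int):
        feature, threshold, left, right = node
        node = left if descriptor[feature] <= threshold else right
    return node


def bovw_histogram(descriptors, assign, n_words):
    """Count how many of an image's descriptors fall on each visual word."""
    counts = Counter(assign(d) for d in descriptors)
    return [counts.get(k, 0) for k in range(n_words)]


# Hypothetical 4-word codebooks over toy 2-D descriptors.
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
tree = (0, 0.5,              # split on feature 0 at 0.5
        (1, 0.5, 0, 2),      # left subtree: split on feature 1
        (1, 0.5, 1, 3))      # right subtree: split on feature 1

# Hypothetical local descriptors extracted from one image.
image = [(0.1, 0.2), (0.9, 0.1), (0.8, 0.9), (0.2, 0.1), (0.95, 0.85)]

flat_hist = bovw_histogram(image, lambda d: nearest_word(d, centroids), 4)
tree_hist = bovw_histogram(image, lambda d: tree_word(d, tree), 4)
print(flat_hist, tree_hist)
```

In this toy setup the tree was chosen so that its leaves cover the same regions as the four centroids, so both codebooks yield the same histogram. The difference lies in the assignment cost: one comparison per tree level rather than one distance computation per visual word.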