Web-scale image retrieval demands a large-scale visual codebook, which is difficult to generate with the commonly adopted K-means vector quantization because of its limited scalability. Although approximate K-means has been proposed to scale up visual codebook construction, it requires a high-precision approximate nearest neighbor (ANN) search in the assignment step and converges with difficulty, which limits its scalability. In this paper, we propose an improved approximate K-means that leverages the assignment information from history, namely the previous iterations, to improve assignment precision. By further randomizing the ANN search employed in each iteration, the proposed algorithm improves assignment precision in a manner conceptually similar to randomized k-d trees, while introducing almost no additional cost. We prove that the algorithm converges, and we demonstrate both experimentally and analytically that it improves the quality of the generated visual codebook as well as its scalability.
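The two ideas in the abstract — reusing each point's previous assignment as an extra candidate, and re-randomizing an approximate centroid search every iteration — can be sketched roughly as below. This is only an illustrative approximation of the described approach, not the authors' implementation: the random subsampling of centroids stands in for a proper randomized ANN structure (e.g., randomized k-d trees), and all function and parameter names (`improved_approx_kmeans`, `n_candidates`) are hypothetical.

```python
import numpy as np

def improved_approx_kmeans(X, k, iters=10, n_candidates=8, seed=0):
    """Sketch of approximate K-means with history-aware assignment.

    In the assignment step, each point considers (a) a small random
    subset of centroids, refreshed every iteration as a crude stand-in
    for a re-randomized ANN search, and (b) the centroid it was
    assigned to in the previous iteration, so the candidate set always
    includes the historical best.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    centroids = X[rng.choice(n, size=k, replace=False)].copy()
    assign = rng.integers(0, k, size=n)  # arbitrary initial assignment
    for _ in range(iters):
        # Assignment step: approximate nearest-centroid search
        for i in range(n):
            cand = rng.choice(k, size=min(n_candidates, k), replace=False)
            cand = np.append(cand, assign[i])  # keep the previous assignment
            d = np.linalg.norm(centroids[cand] - X[i], axis=1)
            assign[i] = cand[int(np.argmin(d))]
        # Update step: recompute each centroid from its assigned points
        for c in range(k):
            pts = X[assign == c]
            if len(pts) > 0:
                centroids[c] = pts.mean(axis=0)
    return centroids, assign
```

Keeping the historical assignment in the candidate set is what lets a cheap, low-precision search remain usable: a point can never be assigned to a worse centroid than its previous one under the current candidate distances, and the per-iteration randomization varies which other centroids get examined.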