Scalar quantization for large scale image search

Authors:
Wengang Zhou;Yijuan Lu;Houqiang Li;Qi Tian
Affiliations:
University of Texas at San Antonio, San Antonio, TX, USA;Texas State University, San Marcos, TX, USA;University of Science and Technology of China, Hefei, China;University of Texas at San Antonio, San Antonio, TX, USA
Venue:
Proceedings of the 20th ACM international conference on Multimedia
Year:
2012

Abstract

The Bag-of-Words (BoW) model based on SIFT has been widely used in large scale image retrieval applications. Feature quantization plays a crucial role in the BoW model: it generates visual words from high-dimensional SIFT features so that they fit the inverted file structure used for indexing. Traditional feature quantization approaches suffer from several problems: 1) high computational cost: visual word generation (codebook construction) is time-consuming, especially with a large number of features; 2) limited reliability: different image collections may produce totally different codebooks, and quantization error is hard to control; 3) update inefficiency: once the codebook is constructed, it is not easy to update. In this paper, a novel feature quantization algorithm, scalar quantization, is proposed. With scalar quantization, a SIFT feature is quantized to a descriptive and discriminative bit-vector, of which the first tens of bits are taken as the code word. Our quantizer is independent of any image collection. In addition, the result of scalar quantization naturally fits the classic inverted file structure for image indexing. Moreover, the quantization error can be flexibly reduced and controlled by efficiently enumerating the nearest neighbors of code words. The performance of scalar quantization has been evaluated on partial-duplicate Web image search over a database of one million images. Experiments reveal that the proposed scalar quantization achieves a 42% relative improvement in mean average precision over the baseline (the hierarchical visual vocabulary tree approach), and also outperforms the state-of-the-art Hamming Embedding approach and the soft assignment method.
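
The pipeline described in the abstract (descriptor to bit-vector, leading bits as code word, inverted file lookup, neighbor enumeration) can be illustrated with a minimal Python sketch. The abstract does not state how each bit is computed, so the rule used here, thresholding every component against the descriptor's own median, is an assumption made for illustration. The 32-bit code length, the function names, and the exhaustive bit-flip enumeration are likewise hypothetical choices, not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations
from statistics import median

CODE_BITS = 32  # assumed length; the abstract only says "the first tens of bits"


def scalar_quantize(descriptor):
    """Binarize a descriptor: 1 where a component exceeds the descriptor's
    own median (assumed rule, not taken from the abstract)."""
    m = median(descriptor)
    return [1 if v > m else 0 for v in descriptor]


def code_word(bits, n_bits=CODE_BITS):
    """Pack the first n_bits of the bit-vector into an integer code word."""
    word = 0
    for b in bits[:n_bits]:
        word = (word << 1) | b
    return word


def neighbors(word, n_bits=CODE_BITS, radius=1):
    """Yield every code word within Hamming distance `radius` of `word`,
    starting with `word` itself."""
    yield word
    for r in range(1, radius + 1):
        for positions in combinations(range(n_bits), r):
            flipped = word
            for p in positions:
                flipped ^= 1 << p
            yield flipped


def build_index(features, n_bits=CODE_BITS):
    """Build an inverted file: code word -> list of (image_id, full bit-vector)."""
    index = defaultdict(list)
    for image_id, descriptor in features:
        bits = scalar_quantize(descriptor)
        index[code_word(bits, n_bits)].append((image_id, bits))
    return index


def query(index, descriptor, n_bits=CODE_BITS, radius=1):
    """Return image ids whose code word lies within `radius` bit flips
    of the query descriptor's code word."""
    bits = scalar_quantize(descriptor)
    hits = []
    for w in neighbors(code_word(bits, n_bits), n_bits, radius):
        hits.extend(image_id for image_id, _ in index.get(w, []))
    return hits
```

Because the quantizer in this sketch depends only on the descriptor itself, no codebook is trained and new images can be indexed without touching existing entries, which mirrors the collection independence claimed in the abstract. Raising `radius` trades lookup cost for tolerance to quantization error, since the number of enumerated code words grows combinatorially with it.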