Cartesian K-Means

  • Authors:
  • Mohammad Norouzi; David J. Fleet

  • Venue:
  • CVPR '13: Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition

  • Year:
  • 2013

Abstract

A fundamental limitation of quantization techniques like the k-means clustering algorithm is the storage and run-time cost associated with the large numbers of clusters required to keep quantization errors small and model fidelity high. We develop new models with a compositional parameterization of cluster centers, so representational capacity increases super-linearly in the number of parameters. This allows one to effectively quantize data using billions or trillions of centers. We formulate two such models, Orthogonal k-means and Cartesian k-means. They are closely related to one another, to k-means, to methods for binary hash function optimization like ITQ (Gong and Lazebnik, 2011), and to Product Quantization for vector quantization (Jegou et al., 2011). The models are tested on large-scale ANN retrieval tasks (1M GIST, 1B SIFT features), and on codebook learning for object recognition (CIFAR-10).
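The compositional parameterization the abstract describes can be illustrated with the product-quantization construction it cites (Jegou et al., 2011): split each vector into m subvectors, learn a small k-means codebook in each subspace, and encode a point by its m sub-codes, so m codebooks of k sub-centers represent k^m effective centers. The NumPy sketch below shows only this baseline construction under illustrative names and parameters; it is not the authors' Cartesian k-means, which additionally optimizes the subspace decomposition.

```python
# Minimal sketch of compositional (product) quantization: m independent
# sub-codebooks over equal-width subspaces yield k**m effective centers
# while storing only m*k sub-center vectors. Illustrative only.
import numpy as np

def lloyd_kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means on X (n x d); returns (centers, assignments)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            pts = X[assign == j]
            if len(pts):
                centers[j] = pts.mean(0)
    return centers, assign

def product_quantize(X, m=4, k=16):
    """Learn m independent sub-codebooks, one per subvector block."""
    subs = np.array_split(X, m, axis=1)  # m equal-width subspaces
    codebooks, codes = [], []
    for Xi in subs:
        C, a = lloyd_kmeans(Xi, k)
        codebooks.append(C)
        codes.append(a)
    # Each row of the code matrix is an m-digit base-k code,
    # indexing one of k**m effective composite centers.
    return codebooks, np.stack(codes, axis=1)

if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(1000, 32)).astype(np.float32)
    books, codes = product_quantize(X, m=4, k=16)
    print(codes.shape)  # (1000, 4): 16**4 = 65536 effective centers
```

With m = 4 and k = 16 each point is stored in 4 bytes of codes, which is the super-linear capacity gain the abstract refers to: representational power grows as k^m while parameter count grows only as m*k.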