Many state-of-the-art object recognition methods extract features from an image, encode them, and then apply a pooling step and a classifier. Within this pipeline, the encoding step is often the bottleneck, both for computational efficiency and for performance. We present a novel assignment-based encoding formulation that fuses assignment-based encoding and sparse coding into a single formulation. At the heart of this formulation lies a quantization into a set of k-sparse vectors, which we call sparse quantization. We use it to design a new, very efficient encoding built from two nested sparse quantizations, whose efficiency stems from leveraging bit-wise representations. In a series of experiments on standard recognition benchmarks, namely Caltech 101, PASCAL VOC 07 and ImageNet, we demonstrate that our method achieves results competitive with the state of the art while requiring orders of magnitude less time and memory. Our method can encode one million images in a single day using 4 CPUs while maintaining good performance.
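To make the idea of quantizing into a set of k-sparse vectors concrete, the following is a minimal sketch, not the paper's implementation. It assumes the codebook is the set of binary vectors with exactly k ones and that distance is Euclidean; under those assumptions the nearest codeword simply sets the k largest entries of the input to 1. The function names `sparse_quantize` and `hamming` are hypothetical, and packing the result into an integer bitmask only illustrates why a bit-wise representation can be cheap to store and compare. The nesting of two such quantizations is not reproduced here.

```python
from typing import Sequence


def sparse_quantize(x: Sequence[float], k: int) -> int:
    """Nearest binary vector with exactly k ones (assumed codebook), as a bitmask.

    For binary b with k ones, ||x - b||^2 = sum(x_i^2) - 2 * sum(x_i for b_i = 1) + k,
    so the minimizer puts its ones on the k largest entries of x.
    """
    if not 0 <= k <= len(x):
        raise ValueError("k must be between 0 and len(x)")
    # Indices of the k largest entries; sorted() is stable, so ties keep the lower index.
    top = sorted(range(len(x)), key=lambda i: -x[i])[:k]
    code = 0
    for i in top:
        code |= 1 << i  # bit i set means entry i of the binary vector is 1
    return code


def hamming(a: int, b: int) -> int:
    """Number of positions where two bitmask codes differ (XOR, then count set bits)."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    code = sparse_quantize([0.1, 0.9, 0.3, 0.7], k=2)
    print(format(code, "04b"))  # entries 1 and 3 are the largest, so bits 1 and 3 are set
```

Because each code is a single integer, comparing two codes reduces to one XOR and a bit count, which is the kind of operation a bit-wise encoding can exploit.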