Optimization of robust loss functions for weakly-labeled image taxonomies: an ImageNet case study
EMMCVPR'11 Proceedings of the 8th international conference on Energy minimization methods in computer vision and pattern recognition
Metric learning for large scale image classification: generalizing to new classes at near-zero cost
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part II
Learning compact visual attributes for large-scale image classification
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III
Sparse discriminative Fisher vectors in visual classification
Proceedings of the Eighth Indian Conference on Computer Vision, Graphics and Image Processing
VISOR: towards on-the-fly large-scale object category retrieval
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part II
Image retrieval using eigen queries
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part II
Large scale visual classification with many classes
MLDM'13 Proceedings of the 9th international conference on Machine Learning and Data Mining in Pattern Recognition
Image Classification with the Fisher Vector: Theory and Practice
International Journal of Computer Vision
Boosted kernel for image categorization
Multimedia Tools and Applications
We address image classification on a large scale, i.e., when a large number of images and classes are involved. First, we study classification accuracy as a function of the image signature dimensionality and the training set size. We show experimentally that the larger the training set, the higher the impact of the dimensionality on the accuracy. In other words, high-dimensional signatures are important for obtaining state-of-the-art results on large datasets. Second, we tackle the problem of compressing very large signatures (on the order of 10^5 dimensions) using two lossy compression strategies: a dimensionality reduction technique known as the hash kernel, and an encoding technique based on product quantizers. We explain how the gain in storage can be traded against a loss in accuracy and/or an increase in CPU cost. We report results on two large databases - ImageNet and a dataset of 1M Flickr images - showing that we can reduce the storage of our signatures by a factor of 64 to 128 with little loss in accuracy. Integrating the decompression into classifier learning yields an efficient and scalable training algorithm. On ILSVRC2010 we report a top-5 accuracy of 74.3%, a 2.5% absolute improvement over the state of the art. On a 10K-class subset of ImageNet we report a top-1 accuracy of 16.7%, a relative improvement of 160% over the state of the art.
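The two compression strategies named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the hash construction, the codebook sizes (m sub-vectors, k centroids), and the plain Lloyd k-means here are illustrative assumptions.

```python
import numpy as np

def hash_kernel(x, out_dim, seed=0):
    """Hashing-trick projection: each input coordinate is mapped to one of
    out_dim buckets with a random +/-1 sign; colliding coordinates are summed.
    The exact hash used in the paper may differ (assumption)."""
    rng = np.random.default_rng(seed)
    buckets = rng.integers(0, out_dim, size=x.shape[0])
    signs = rng.choice([-1.0, 1.0], size=x.shape[0])
    y = np.zeros(out_dim)
    np.add.at(y, buckets, signs * x)  # scatter-add with collisions
    return y

class ProductQuantizer:
    """Product quantization: split each D-dim vector into m sub-vectors and
    vector-quantize each part with its own k-centroid codebook, so a vector
    is stored as m small integers (one byte each when k <= 256)."""
    def __init__(self, m=8, k=16, iters=10, seed=0):
        self.m, self.k, self.iters = m, k, iters
        self.rng = np.random.default_rng(seed)

    def fit(self, X):
        # D must be divisible by m for np.split.
        parts = np.split(X, self.m, axis=1)
        self.codebooks = []
        for P in parts:
            C = P[self.rng.choice(len(P), self.k, replace=False)]
            for _ in range(self.iters):  # plain Lloyd iterations
                d = ((P[:, None, :] - C[None]) ** 2).sum(-1)
                assign = d.argmin(1)
                for j in range(self.k):
                    if (assign == j).any():
                        C[j] = P[assign == j].mean(0)
            self.codebooks.append(C)
        return self

    def encode(self, x):
        parts = np.split(x, self.m)
        return np.array([((C - p) ** 2).sum(1).argmin()
                         for p, C in zip(parts, self.codebooks)],
                        dtype=np.uint8)

    def decode(self, codes):
        # Reconstruct by concatenating the selected centroids.
        return np.concatenate([C[c] for c, C in zip(codes, self.codebooks)])
```

With m=8 and k=16, a 64-dimensional float vector (256 bytes) is stored as 8 bytes, a 32x reduction; the paper's 64-128x factors correspond to much higher-dimensional signatures. The hash kernel is linear, so a classifier can be trained directly in the reduced space.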