Optimization of robust loss functions for weakly-labeled image taxonomies: an ImageNet case study

Authors:
Julian J. McAuley; Arnau Ramisa; Tibério S. Caetano
Affiliations:
NICTA, and the Australian National University; Institut de Robòtica i Informàtica Industrial (CSIC-UPC), Spain; NICTA, and the Australian National University
Venue:
EMMCVPR'11 Proceedings of the 8th international conference on Energy minimization methods in computer vision and pattern recognition
Year:
2011

Abstract

The recently proposed ImageNet dataset consists of several million images, each annotated with a single object category. However, these annotations may be imperfect, in the sense that many images contain multiple objects belonging to the label vocabulary. In other words, we have a multi-label problem but the annotations include only a single label (and not necessarily the most prominent one). Such a setting motivates the use of a robust evaluation measure that allows a limited number of labels to be predicted and counts the overall prediction as correct as long as one of the predicted labels is correct. This is indeed the type of evaluation measure used to assess algorithm performance in a recent competition on ImageNet data. Optimizing such performance measures presents several hurdles even with existing structured output learning methods. Indeed, many of the current state-of-the-art methods optimize the prediction of only a single output label, ignoring this 'structure' altogether. In this paper, we show how to directly optimize continuous surrogates of such performance measures using structured output learning techniques with latent variables. We use the output of existing binary classifiers as input features in a new learning stage which optimizes the structured loss corresponding to the robust performance measure. We present empirical evidence that this allows us to 'boost' the performance of existing binary classifiers which are the state-of-the-art for the task of object classification in ImageNet.
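
The robust evaluation measure described in the abstract can be illustrated with a minimal sketch. It covers only the evaluation rule (predict a limited number of labels and count the prediction as correct if the single annotated label is among them), not the continuous surrogate or the latent-variable learning stage from the paper. The function names, the parameter `k`, and the example scores are assumptions introduced here for illustration.

```python
from typing import List, Sequence


def top_k_labels(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring classes.

    `scores` is assumed to hold one score per class, e.g. the outputs of
    per-class binary classifiers. Ties are broken in favor of the lower index.
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of classes")
    # sorted() is stable, so equal scores keep their original index order.
    order = sorted(range(len(scores)), key=lambda c: -scores[c])
    return order[:k]


def robust_loss(scores: Sequence[float], annotated_label: int, k: int) -> int:
    """Return 0 if the annotated label is among the k predictions, else 1."""
    return 0 if annotated_label in top_k_labels(scores, k) else 1


def mean_robust_loss(
    all_scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Average the 0/1 robust loss over a set of images."""
    losses = [robust_loss(s, y, k) for s, y in zip(all_scores, labels)]
    return sum(losses) / len(losses)


# Hypothetical scores for one image over four classes; the annotation is class 2.
example_scores = [0.1, 0.7, 0.4, 0.9]
single_guess = robust_loss(example_scores, annotated_label=2, k=1)
three_guesses = robust_loss(example_scores, annotated_label=2, k=3)
```

With a single allowed prediction the example image is counted as an error, because class 3 scores highest; with three allowed predictions the annotated class 2 falls inside the predicted set and the loss is zero. This is the sense in which the measure tolerates images that contain several objects but carry only one annotation.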