Descriptor learning for efficient retrieval

  • Authors:
  • James Philbin; Michael Isard; Josef Sivic; Andrew Zisserman

  • Affiliations:
  • Department of Engineering Science, University of Oxford; Microsoft Research, Silicon Valley; INRIA, WILLOW, Laboratoire d'Informatique de l'École Normale Supérieure, Paris; Department of Engineering Science, University of Oxford

  • Venue:
  • ECCV'10: Proceedings of the 11th European Conference on Computer Vision, Part III
  • Year:
  • 2010


Abstract

Many visual search and matching systems represent images using sparse sets of "visual words": descriptors that have been quantized by assignment to the best-matching symbol in a discrete vocabulary. Errors in this quantization procedure propagate throughout the rest of the system, either harming performance or requiring correction using additional storage or processing. This paper aims to reduce these quantization errors at source, by learning a projection from descriptor space to a new Euclidean space in which standard clustering techniques are more likely to assign matching descriptors to the same cluster, and non-matching descriptors to different clusters. To achieve this, we learn a non-linear transformation model by minimizing a novel margin-based cost function, which aims to separate matching descriptors from two classes of non-matching descriptors. Training data is generated automatically by leveraging geometric consistency. Scalable stochastic gradient methods are used for the optimization. For the case of particular object retrieval, we demonstrate impressive gains in performance on a ground-truth dataset: our learnt 32-D descriptor without spatial re-ranking outperforms a baseline method using 128-D SIFT descriptors with spatial re-ranking.
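
To make the margin-based idea concrete, here is a minimal sketch, not the paper's actual model: the paper learns a non-linear transformation with a cost that separates matches from two classes of non-matching descriptors, whereas this simplified NumPy version learns a single linear projection (128-D to 32-D) under a single hinge margin, trained by plain stochastic gradient descent. All names and hyperparameters (margin, lr, epochs) are illustrative.

```python
# Minimal, illustrative sketch -- NOT the paper's model. The paper uses a
# non-linear transformation and two classes of non-matching descriptors;
# here we simplify to one linear projection W and one hinge margin, with SGD.
import numpy as np

def hinge_pair_grad(W, x1, x2, match, margin=1.0):
    """Loss and gradient for one descriptor pair under a hinge margin.

    Matching pairs are penalized when their projected squared distance
    exceeds the margin; non-matching pairs when it falls below it.
    """
    diff = x1 - x2
    d = W @ diff                        # pair difference in projected space
    dist2 = d @ d                       # squared Euclidean distance
    viol = (dist2 - margin) if match else (margin - dist2)
    if viol <= 0.0:                     # margin satisfied: no loss, no update
        return 0.0, np.zeros_like(W)
    sign = 1.0 if match else -1.0       # pull matches in, push non-matches out
    return viol, 2.0 * sign * np.outer(d, diff)

def train_projection(pairs, in_dim=128, out_dim=32, lr=1e-3, epochs=10, seed=0):
    """SGD over (x1, x2, is_match) triples; returns the learnt projection W."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / np.sqrt(in_dim), size=(out_dim, in_dim))
    for _ in range(epochs):
        for i in rng.permutation(len(pairs)):
            x1, x2, match = pairs[i]
            _, grad = hinge_pair_grad(W, x1, x2, match)
            W -= lr * grad
    return W

# Toy usage with random "descriptors". Real training pairs would come from
# geometrically verified correspondences, as described in the abstract.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    base = rng.normal(size=(200, 128))
    pairs = [(x, x + 0.05 * rng.normal(size=128), True) for x in base]
    pairs += [(x, rng.normal(size=128), False) for x in base]
    W = train_projection(pairs)
    print("learnt projection shape:", W.shape)   # (32, 128)
```

A production version would mini-batch the updates and add the paper's second class of non-matching pairs; the toy block above only checks that the optimization runs end to end.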