SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition

  • Authors:
  • Hao Zhang; Alexander C. Berg; Michael Maire; Jitendra Malik

  • Affiliations:
  • Univ. of California, Berkeley (all authors)

  • Venue:
  • CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
  • Year:
  • 2006

Abstract

We consider visual category recognition in the framework of measuring similarities, or equivalently perceptual distances, to prototype examples of categories. This approach is quite flexible, and permits recognition based on color, texture, and particularly shape, in a homogeneous framework. While nearest neighbor classifiers are natural in this setting, they suffer from high variance (in the bias-variance decomposition) in the case of limited sampling. Alternatively, one could use support vector machines, but they involve time-consuming optimization and computation of pairwise distances. We propose a hybrid of these two methods which deals naturally with the multiclass setting, has reasonable computational complexity both in training and at run time, and yields excellent results in practice. The basic idea is to find close neighbors to a query sample and train a local support vector machine that preserves the distance function on the collection of neighbors. Our method can be applied to large, multiclass data sets, on which it outperforms nearest neighbor and support vector machines, and remains efficient when the problem becomes intractable for support vector machines. A wide variety of distance functions can be used, and our experiments show state-of-the-art performance on a number of benchmark data sets for shape and texture classification (MNIST, USPS, CUReT) and object recognition (Caltech-101). On Caltech-101 we achieved a correct classification rate of 59.05% (±0.56%) at 15 training images per class, and 66.23% (±0.48%) at 30 training images per class.
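
The core procedure the abstract describes (find a query's nearest neighbors, then train an SVM locally on just those neighbors, with a kernel built from the distance function) can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses Euclidean distance, the negated squared-distance kernel, scikit-learn's SVC, and arbitrary parameter defaults, whereas the paper works with perceptual distances such as shape context and its own multiclass SVM machinery.

```python
# Minimal SVM-KNN sketch. Assumptions (not from the paper): Euclidean
# distance, a -d^2 "distance kernel", scikit-learn's SVC, and the
# parameter defaults below.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import pairwise_distances


def svm_knn_predict(X_train, y_train, x_query, n_neighbors=10, C=1.0):
    """Classify one query: select its nearest neighbors, then train an
    SVM on only those neighbors using a kernel built from distances."""
    # Distances from the query to every training sample.
    d_query = pairwise_distances(x_query.reshape(1, -1), X_train).ravel()
    idx = np.argsort(d_query)[:n_neighbors]
    X_local, y_local = X_train[idx], y_train[idx]

    # Shortcut: if all neighbors already agree, no SVM is needed.
    if len(np.unique(y_local)) == 1:
        return y_local[0]

    # Kernel matrix over the neighborhood from pairwise distances.
    # -d^2 is conditionally positive definite for Euclidean distance,
    # so the local SVM effectively preserves the distance function.
    D_local = pairwise_distances(X_local)
    svm = SVC(kernel="precomputed", C=C)
    svm.fit(-D_local ** 2, y_local)

    # Kernel values between the query and the neighbors, then classify.
    d_q = pairwise_distances(x_query.reshape(1, -1), X_local)
    return svm.predict(-d_q ** 2)[0]


if __name__ == "__main__":
    # Toy demo on synthetic Gaussian blobs (illustrative data only).
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(c, 1.0, size=(50, 8)) for c in (0.0, 3.0, 6.0)])
    y = np.repeat([0, 1, 2], 50)
    print(svm_knn_predict(X, y, X[0]))  # expected: class 0
```

Because a fresh SVM is trained per query on only n_neighbors samples, the expensive global SVM optimization over all pairwise distances is avoided, which is what keeps the method tractable on large multiclass data sets.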