The truth about cats and dogs

  • Authors:
  • Omkar M. Parkhi; Andrea Vedaldi; C. V. Jawahar; Andrew Zisserman

  • Affiliations:
  • Center for Visual Information Technology, International Institute of Information Technology, Hyderabad 500032, India; Department of Engineering Science, University of Oxford, United Kingdom; Center for Visual Information Technology, International Institute of Information Technology, Hyderabad 500032, India; Department of Engineering Science, University of Oxford, United Kingdom

  • Venue:
  • ICCV '11 Proceedings of the 2011 International Conference on Computer Vision
  • Year:
  • 2011

Abstract

Template-based object detectors such as the deformable parts model of Felzenszwalb et al. [11] achieve state-of-the-art performance for a variety of object categories, but are still outperformed by simpler bag-of-words models for highly flexible objects such as cats and dogs. In these cases we propose to use the template-based model to detect a distinctive part for the class, followed by detecting the rest of the object via segmentation on image-specific information learnt from that part. This approach is motivated by two observations: (i) many object classes contain distinctive parts that can be detected very reliably by template-based detectors, whilst the entire object cannot; (ii) many classes (e.g. animals) have fairly homogeneous coloring and texture that can be used to segment the object once a sample is provided in an image. We show quantitatively that our method substantially outperforms whole-body template-based detectors for these highly deformable object categories, and indeed achieves accuracy comparable to the state-of-the-art on the PASCAL VOC competition, which includes other models such as bag-of-words.
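The second observation, that a detected part can serve as a color sample for segmenting the rest of the object, can be illustrated with a minimal sketch. This is not the paper's segmentation method: it assumes the part's bounding box is already given (standing in for the template-based detector's output), models background color from the image border, and labels each pixel by a simple histogram likelihood comparison. All function names and parameters (`quantize`, `color_histogram`, `segment_from_part`, `mask_to_box`, `bins`, `border`) are hypothetical.

```python
import numpy as np


def quantize(image, bins=8):
    """Map each RGB pixel (uint8, H x W x 3) to a single color-bin index."""
    q = (image.astype(np.int64) * bins) // 256  # per-channel bin in [0, bins)
    return q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]


def color_histogram(bin_ids, bins=8, eps=1.0):
    """Smoothed, normalized histogram over the bins**3 color bins."""
    counts = np.bincount(bin_ids.ravel(), minlength=bins ** 3).astype(np.float64)
    counts += eps  # additive smoothing so unseen colors keep nonzero mass
    return counts / counts.sum()


def segment_from_part(image, part_box, bins=8, border=2):
    """Return a boolean mask of pixels that look more like the part than the border.

    part_box is (x0, y0, x1, y1) with exclusive upper bounds. The background
    model from the image border is an assumption of this sketch.
    """
    x0, y0, x1, y1 = part_box
    ids = quantize(image, bins)

    # Foreground color model: pixels inside the detected part.
    fg_hist = color_histogram(ids[y0:y1, x0:x1], bins)

    # Background color model: a thin frame around the image.
    border_mask = np.ones(ids.shape, dtype=bool)
    border_mask[border:-border, border:-border] = False
    bg_hist = color_histogram(ids[border_mask], bins)

    # Per-pixel decision: foreground where the part model is more likely.
    return fg_hist[ids] > bg_hist[ids]


def mask_to_box(mask):
    """Tight bounding box (x0, y0, x1, y1) around the True pixels of a mask."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1
```

Given an image and a part box, `mask_to_box(segment_from_part(image, part_box))` yields a whole-object box estimate under these assumptions. A per-pixel rule like this ignores spatial smoothness, so a practical system would likely use a proper segmentation algorithm seeded in the same way.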