The truth about cats and dogs

Authors:
Omkar M. Parkhi;Andrea Vedaldi;C. V. Jawahar;Andrew Zisserman
Affiliations:
Center for Visual Information Technology, International Institute of Information Technology, Hyderabad 500032, India;Department of Engineering Science, University of Oxford, United Kingdom;Center for Visual Information Technology, International Institute of Information Technology, Hyderabad 500032, India;Department of Engineering Science, University of Oxford, United Kingdom
Venue:
ICCV '11 Proceedings of the 2011 International Conference on Computer Vision
Year:
2011

Citing 0
Cited 8

Dog breed classification using part localization
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part I

Object detection using strongly-supervised deformable part models
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part I

Diagnosing error in object detectors
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III

Multi-component models for object detection
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part IV

Viewpoint based mobile robotic exploration aiding object search in indoor environment
Proceedings of the Eighth Indian Conference on Computer Vision, Graphics and Image Processing

Neti Neti: in search of deity
Proceedings of the Eighth Indian Conference on Computer Vision, Graphics and Image Processing

Arbitrary-Shape object localization using adaptive image grids
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part I

Cross-modal alignment for wildlife recognition
Proceedings of the 2nd ACM international workshop on Multimedia analysis for ecological data

Abstract

Template-based object detectors such as the deformable parts model of Felzenszwalb et al. [11] achieve state-of-the-art performance for a variety of object categories, but are still outperformed by simpler bag-of-words models for highly flexible objects such as cats and dogs. In these cases we propose to use the template-based model to detect a distinctive part for the class, followed by detecting the rest of the object via segmentation on image-specific information learnt from that part. This approach is motivated by two observations: (i) many object classes contain distinctive parts that can be detected very reliably by template-based detectors, whilst the entire object cannot; (ii) many classes (e.g. animals) have fairly homogeneous coloring and texture that can be used to segment the object once a sample is provided in an image. We show quantitatively that our method substantially outperforms whole-body template-based detectors for these highly deformable object categories, and indeed achieves accuracy comparable to the state-of-the-art on the PASCAL VOC competition, which includes other models such as bag-of-words.
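
The two-stage idea in the abstract (detect a distinctive part, then grow the detection to the whole object using colour learnt from that part) can be illustrated with a minimal sketch. This is not the authors' implementation: the part box is assumed to come from some template-based detector that is not shown, and the segmentation step is replaced here by a simple per-pixel colour-histogram likelihood test, with the image border assumed to be background. All function names and parameters (`quantize`, `colour_histogram`, `segment_from_part`, `mask_to_box`, `bins`, `border`) are hypothetical.

```python
import numpy as np


def quantize(image, bins=8):
    """Map each RGB pixel (uint8) to a single colour-bin index."""
    q = (image.astype(np.int64) * bins) // 256  # per-channel bin in [0, bins)
    return q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]


def colour_histogram(bin_idx, bins=8, eps=1e-3):
    """Normalised colour histogram with additive smoothing."""
    hist = np.bincount(bin_idx.ravel(), minlength=bins ** 3).astype(np.float64)
    hist += eps  # avoid log(0) for colours never seen in the sample
    return hist / hist.sum()


def segment_from_part(image, part_box, bins=8, border=2):
    """Label pixels as object where the part's colour model beats the background's.

    part_box is (x0, y0, x1, y1), end-exclusive, assumed to come from a part
    detector. The background model is learnt from a thin image border
    (an assumption of this sketch, not something stated in the abstract).
    """
    x0, y0, x1, y1 = part_box
    idx = quantize(image, bins)

    # Foreground colour model: pixels inside the detected part.
    fg = colour_histogram(idx[y0:y1, x0:x1], bins)

    # Background colour model: pixels within `border` of the image edge.
    border_mask = np.ones(idx.shape, dtype=bool)
    border_mask[border:-border, border:-border] = False
    bg = colour_histogram(idx[border_mask], bins)

    # Per-pixel log-likelihood comparison.
    return np.log(fg[idx]) > np.log(bg[idx])


def mask_to_box(mask):
    """Tight end-exclusive bounding box (x0, y0, x1, y1) of a mask, or None."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1
```

On a synthetic image with a uniformly coloured object on a differently coloured background, passing a box that covers only a corner of the object to `segment_from_part` and then calling `mask_to_box` recovers the full object extent. Real images would need a spatially regularised segmentation rather than an independent per-pixel decision.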