Comparing state-of-the-art visual features on invariant object recognition tasks

  • Authors:
  • Nicolas Pinto;Youssef Barhomi;David D. Cox;James J. DiCarlo

  • Affiliations:
  • Massachusetts Institute of Technology, Cambridge, MA, U.S.A;Massachusetts Institute of Technology, Cambridge, MA, U.S.A;The Rowland Institute at Harvard, Cambridge, MA, U.S.A;Massachusetts Institute of Technology, Cambridge, MA, U.S.A

  • Venue:
  • WACV '11 Proceedings of the 2011 IEEE Workshop on Applications of Computer Vision (WACV)
  • Year:
  • 2011

Abstract

Tolerance (“invariance”) to identity-preserving image variation (e.g. variation in position, scale, pose, illumination) is a fundamental problem that any visual object recognition system, biological or engineered, must solve. While standard natural image database benchmarks are useful for guiding progress in computer vision, they can fail to probe the ability of a recognition system to solve the invariance problem [23, 24, 25]. Thus, to understand which computational approaches are making progress on solving the invariance problem, we compared and contrasted a variety of state-of-the-art visual representations using synthetic recognition tasks designed to systematically probe invariance. We successfully re-implemented a variety of state-of-the-art visual representations and confirmed their published performance on a natural image benchmark. We here report that most of these representations perform poorly on invariant recognition, but that one representation [21] shows significant performance gains over two baseline representations. We also show how this approach can more deeply illuminate the strengths and weaknesses of different visual representations and thus guide progress on invariant object recognition.