Evaluation of pooling operations in convolutional architectures for object recognition

  • Authors:
  • Dominik Scherer;Andreas Müller;Sven Behnke

  • Affiliations:
  • University of Bonn, Institute of Computer Science VI, Bonn, Germany (all authors)

  • Venue:
  • ICANN'10 Proceedings of the 20th international conference on Artificial neural networks: Part III
  • Year:
  • 2010

Abstract

A common practice to gain invariant features in object recognition models is to aggregate multiple low-level features over a small neighborhood. However, the differences between those models make it hard to compare the properties of different aggregation functions. Our aim is to gain insight into different functions by directly comparing them on a fixed architecture for several common object recognition tasks. Empirical results show that a maximum pooling operation significantly outperforms subsampling operations. Despite their shift-invariant properties, overlapping pooling windows offer no significant improvement over non-overlapping pooling windows. By applying this knowledge, we achieve state-of-the-art error rates of 4.57% on the NORB normalized-uniform dataset and 5.6% on the NORB jittered-cluttered dataset.
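
To make the aggregation functions compared in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the helper names `pool2d` and `mean`, the 4x4 toy feature map, and the 2x2 window are all assumptions chosen for illustration, and subsampling is approximated here as a plain average over the window, without any trainable scaling, bias, or nonlinearity that a particular architecture might add.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def pool2d(
    feature_map: Sequence[Sequence[float]],
    window: int,
    stride: int,
    aggregate: Callable[[List[float]], float],
) -> Matrix:
    """Slide a window x window region over feature_map and aggregate each region.

    stride == window gives non-overlapping windows;
    stride < window gives overlapping windows.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled: Matrix = []
    for top in range(0, rows - window + 1, stride):
        out_row = []
        for left in range(0, cols - window + 1, stride):
            # Collect every value inside the current window.
            region = [
                feature_map[top + dy][left + dx]
                for dy in range(window)
                for dx in range(window)
            ]
            out_row.append(aggregate(region))
        pooled.append(out_row)
    return pooled


def mean(values: List[float]) -> float:
    """Average of a window (stand-in for subsampling; an assumption)."""
    return sum(values) / len(values)


# Hypothetical toy feature map, not data from the paper.
feature_map = [
    [1.0, 2.0, 0.0, 1.0],
    [3.0, 4.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 6.0],
    [2.0, 0.0, 7.0, 8.0],
]

max_nonoverlap = pool2d(feature_map, window=2, stride=2, aggregate=max)
avg_nonoverlap = pool2d(feature_map, window=2, stride=2, aggregate=mean)
max_overlap = pool2d(feature_map, window=2, stride=1, aggregate=max)

print(max_nonoverlap)  # [[4.0, 1.0], [2.0, 8.0]]
print(avg_nonoverlap)  # [[2.5, 0.5], [0.75, 6.5]]
print(max_overlap)     # [[4.0, 4.0, 1.0], [4.0, 5.0, 6.0], [2.0, 7.0, 8.0]]
```

In this sketch the only difference between the pooling variants is the `aggregate` function and the `stride`, which mirrors the idea of comparing aggregation functions on an otherwise fixed architecture. The toy output says nothing about recognition accuracy; it only shows what each operation computes.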