Learning compositional categorization models

  • Authors:
  • Björn Ommer; Joachim M. Buhmann

  • Affiliations:
  • Institute of Computational Science, ETH Zurich, Zurich, Switzerland; Institute of Computational Science, ETH Zurich, Zurich, Switzerland

  • Venue:
  • ECCV'06 Proceedings of the 9th European Conference on Computer Vision - Volume Part III
  • Year:
  • 2006

Abstract

This contribution proposes a compositional approach to the visual object categorization of scenes. Compositions are learned from the Caltech 101 database and form intermediate abstractions of images that are semantically situated between low-level representations and the high-level categorization. Salient regions, described by localized feature histograms, are detected as image parts. Compositions are then formed as bags of parts subject to a locality constraint. After the compositions are spatially bound by means of a shape model, coupled probabilistic kernel classifiers are applied to establish the final image categorization. In contrast to the discriminative training of the categorizer, intermediate compositions are learned in a generative manner, yielding relevant part agglomerations, i.e., groupings that appear frequently in the dataset while simultaneously supporting the discrimination between sets of categories. Consequently, compositionality simplifies the learning of a complex categorization model for complete scenes by splitting it up into simpler, sharable compositions. The architecture is evaluated on the highly challenging Caltech 101 database, which exhibits large intra-category variations. Our compositional approach shows competitive retrieval rates of 53.6 ± 0.88% or, with a multi-scale feature set, 57.8 ± 0.79%.
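
The idea of forming compositions as "bags of parts with a locality constraint" can be illustrated with a minimal Python sketch. The abstract does not specify how parts are quantized or how locality is enforced, so everything here is an assumption made for illustration: each part is taken to carry a position and a hard codebook label, locality is modeled as a fixed radius around an anchor part, and the function name `form_compositions` and its parameters are hypothetical. This is not the authors' implementation.

```python
from collections import Counter
from math import dist


def form_compositions(parts, radius, n_codewords):
    """Group parts into local bags and return one normalized histogram per anchor.

    parts: list of ((x, y), codeword_index) tuples (assumed representation).
    radius: assumed locality constraint; only parts this close to the anchor are grouped.
    n_codewords: size of the assumed part codebook.
    """
    compositions = []
    for anchor_xy, _ in parts:
        # Locality constraint: collect labels of parts near the anchor (anchor included).
        members = [label for xy, label in parts if dist(anchor_xy, xy) <= radius]
        counts = Counter(members)
        total = len(members)  # at least 1, because the anchor is always a member
        # Bag of parts: positions are discarded, only label frequencies remain.
        compositions.append([counts[k] / total for k in range(n_codewords)])
    return compositions


# Toy input: three nearby parts and one isolated part.
parts = [((0.0, 0.0), 0), ((1.0, 0.0), 1), ((0.0, 1.0), 1), ((10.0, 10.0), 2)]
comps = form_compositions(parts, radius=2.0, n_codewords=3)
```

In this toy input, the three nearby parts share the same bag, while the isolated part forms a composition containing only itself. In the described architecture, such intermediate representations would then be spatially bound by a shape model and passed to the classifiers, which this sketch does not cover.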