Weakly supervised semantic segmentation with a multi-image model

  • Authors:
  • Alexander Vezhnevets;Vittorio Ferrari;Joachim M. Buhmann

  • Affiliations:
  • ETH Zurich, 8092, Switzerland;ETH Zurich, 8092, Switzerland;ETH Zurich, 8092, Switzerland

  • Venue:
  • ICCV '11 Proceedings of the 2011 International Conference on Computer Vision
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

We propose a novel method for weakly supervised semantic segmentation. Training images are labeled only by the classes they contain, not by their location in the image. On test images instead, the method predicts a class label for every pixel. Our main innovation is a multi-image model (MIM) - a graphical model for recovering the pixel labels of the training images. The model connects superpixels from all training images in a data-driven fashion, based on their appearance similarity. For generalizing to new test images we integrate them into MIM using a learned multiple kernel metric, instead of learning conventional classifiers on the recovered pixel labels. We also introduce an "objectness" potential, that helps separating objects (e.g. car, dog, human) from background classes (e.g. grass, sky, road). In experiments on the MSRC 21 dataset and the LabelMe subset of [18], our technique outperforms previous weakly supervised methods and achieves accuracy comparable with fully supervised methods.