Learning to segment images using region-based perceptual features

  • Authors:
  • John Kaufhold; Anthony Hoogs

  • Affiliations:
  • Visualization and Computer Vision Laboratory, General Electric Global Research Center (both authors)

  • Venue:
  • CVPR '04: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition
  • Year:
  • 2004

Abstract

The recent establishment of a large-scale ground-truth database of image segmentations [11] has enabled the development of learning approaches to the general segmentation problem. Using this database, we present an algorithm that learns how to segment images using region-based, perceptual features. The image is first densely segmented into regions and the edges between them using a variant of the Mumford-Shah functional. Each edge is classified as a boundary or non-boundary using a classifier trained on the ground-truth, resulting in an edge image estimating human-designated boundaries. This novel approach has a few distinct advantages over filter-based methods such as local gradient operators. First, the same perceptual features can represent texture as well as regular structure. Second, the features can measure relationships between image elements at arbitrary distances in the image, enabling the detection of Gestalt properties at any scale. Third, texture boundaries can be precisely localized, which is difficult when using filter banks. Finally, the learning system outputs a relatively small set of intuitive perceptual rules for detecting boundaries. The classifier is trained on 200 images in the ground-truth database, and tested on another 100 images according to the benchmark evaluation methods. Edge classification improves the benchmark F-score from 0.54, for the initial Mumford-Shah-variant segmentation, to 0.61 on grayscale images. This relative increase of 13% demonstrates the versatility and representational power of our perceptual features, as the score exceeds published results for any algorithm restricted to one type of image feature such as texture or brightness gradient.
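The F-score reported above is the standard harmonic mean of boundary precision and recall used in segmentation benchmarks. A minimal sketch of that computation (the function name and example values are illustrative, not taken from the paper):

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall,
    F = 2PR / (P + R), as used in boundary-detection benchmarks."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# When precision and recall are equal, F equals that common value:
# f_score(0.61, 0.61) -> 0.61
```

The harmonic mean penalizes imbalance: a detector with perfect precision but poor recall (or vice versa) scores well below the arithmetic mean of the two, which is why it is the preferred single-number summary for boundary detection.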