Context-dependent segmentation and matching in image databases

  • Authors: Hayit Greenspan; Guy Dvir; Yossi Rubner

  • Affiliations: Faculty of Engineering, Tel-Aviv University, Tel-Aviv 69978, Israel; Faculty of Engineering, Tel-Aviv University, Tel-Aviv 69978, Israel; Rubner Technology Consulting, Israel

  • Venue: Computer Vision and Image Understanding
  • Year: 2004

Abstract

The content of an image can be summarized by a set of homogeneous regions in an appropriate feature space. When exact shape is not important, the regions can be represented by simple "blobs." Even for similar images, the blob representations may differ in blob shape, position, and number, and in the features represented. In addition, separate blobs in one image might correspond to a single blob in the other image, and vice versa. In this paper we present the BlobEMD framework, a novel method for computing the dissimilarity of two sets of blobs while allowing context-based adaptation of the image representation. The resulting representations remain faithful to the original images while being best aligned with the representations of the context images. Similarly, image segmentation can be guided by a reference image, which makes segmentation a context-based task. We compute the blobs using Gaussian mixture modeling and use the Earth mover's distance (EMD) to compute both the dissimilarity of the images and the flow matrix of the blobs between the images. The BlobEMD flow matrix is used to find optimal correspondences between the source and target image representations and to adapt the representation of the source image to that of the target image. This allows for similarity measures between images that are insensitive to the segmentation process and to differing levels of detail in the representation. We show applications of this method to content-based image retrieval, image segmentation, and matching models of heavily dithered images with models of full-resolution images.
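
To make the role of the EMD flow matrix concrete, the following is a minimal sketch, not the authors' implementation. It assumes each blob is reduced to a mixture weight and a mean vector, and it uses the Euclidean distance between blob means as the ground distance; the abstract does not specify the ground distance, so that choice is an assumption, as are the function name `blob_emd` and the toy data. The transportation problem is solved with SciPy's general-purpose `linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def blob_emd(w_src, mu_src, w_tgt, mu_tgt):
    """Return (EMD, flow matrix) between two weighted sets of blob means.

    Hypothetical helper for illustration only. Ground distance is assumed
    to be Euclidean distance between blob means.
    """
    w_src = np.asarray(w_src, dtype=float)
    w_tgt = np.asarray(w_tgt, dtype=float)
    mu_src = np.asarray(mu_src, dtype=float)
    mu_tgt = np.asarray(mu_tgt, dtype=float)
    m, n = len(w_src), len(w_tgt)

    # Pairwise ground distances between source and target blob means.
    dist = np.linalg.norm(mu_src[:, None, :] - mu_tgt[None, :, :], axis=2)

    # Flow variables f[i, j] are flattened row-major: index = i * n + j.
    A_ub = np.zeros((m + n, m * n))
    for i in range(m):
        A_ub[i, i * n:(i + 1) * n] = 1.0   # flow out of source blob i <= w_src[i]
    for j in range(n):
        A_ub[m + j, j::n] = 1.0            # flow into target blob j <= w_tgt[j]
    b_ub = np.concatenate([w_src, w_tgt])

    # Total flow equals the smaller of the two total weights.
    total = min(w_src.sum(), w_tgt.sum())
    res = linprog(
        dist.ravel(),
        A_ub=A_ub, b_ub=b_ub,
        A_eq=np.ones((1, m * n)), b_eq=[total],
        bounds=(0, None),
        method="highs",
    )
    flow = res.x.reshape(m, n)
    return res.fun / total, flow


# Toy example: one large source blob corresponds to two smaller target blobs.
src_w, src_mu = [0.6, 0.4], [[0.0, 0.0], [4.0, 0.0]]
tgt_w, tgt_mu = [0.3, 0.3, 0.4], [[0.0, 1.0], [0.0, -1.0], [4.0, 0.0]]

emd, flow = blob_emd(src_w, src_mu, tgt_w, tgt_mu)
print(round(emd, 6))      # 0.6
print(np.round(flow, 6))  # rows: source blobs, columns: target blobs
```

In this toy case the first row of the flow matrix splits evenly across the first two target blobs, which is the kind of one-to-many correspondence the abstract describes. How the framework actually uses such correspondences to adapt the source representation is not detailed in the abstract and is not modeled here.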