Occlusion boundaries: low-level detection to high-level reasoning

  • Authors:
  • Martial Hebert;Andrew Neil Stein

  • Affiliations:
  • Carnegie Mellon University;Carnegie Mellon University

  • Venue:
  • Occlusion boundaries: low-level detection to high-level reasoning
  • Year:
  • 2008

Abstract

The boundaries of objects in an image are often considered a nuisance to be “handled” due to the occlusion they exhibit. Since most, if not all, computer vision techniques aggregate information spatially within a scene, information spanning these boundaries, and therefore from different physical surfaces, is invariably and erroneously considered together. In addition, these boundaries convey important perceptual information about 3D scene structure and shape. Consequently, their identification can benefit many different computer vision pursuits, from low-level processing techniques to high-level reasoning tasks.

While much focus in computer vision is placed on the processing of individual, static images, many applications actually offer video, or sequences of images, as input. The extra temporal dimension of the data allows the motion of the camera or the scene to be used in processing. In this thesis, we focus on the exploitation of subtle relative-motion cues present at occlusion boundaries. When combined with more standard appearance information, we demonstrate these cues' utility in detecting occlusion boundaries locally. We also present a novel, mid-level model for reasoning more globally about object boundaries and propagating such local information to extract improved, extended boundaries. Building on these methods, we also demonstrate enhancement of two high-level vision tasks by incorporating boundary information. First we employ boundary fragments to suggest multiple “hints” of a scene segmentation and then use these suggestions collectively to achieve more consistent and parsimonious delineation of generic whole objects. Second, we augment a popular feature-based recognition technique for specific objects (the Scale Invariant Feature Transform) with boundary information in order to yield a method more robust to changes in background and scale.

This thesis thus contributes to research on occlusion at several levels, from low-level motion estimation and feature extraction; to mid-level reasoning, classification, and propagation; and finally to high-level segmentation and recognition. In addition, a new video dataset is presented to enable further research in this area.