Interpretation of complex scenes using dynamic tree-structure Bayesian networks

  • Authors:
  • Sinisa Todorovic; Michael C. Nechyba

  • Affiliations:
  • Computer Vision and Robotics Laboratory, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  • Pittsburgh Pattern Recognition, Inc., 40 24th Street, Suite 240, Pittsburgh, PA 15222, USA

  • Venue:
  • Computer Vision and Image Understanding
  • Year:
  • 2007


Abstract

This paper addresses the problem of object detection and recognition in complex scenes, where objects are partially occluded. The approach presented herein is based on the hypothesis that a careful analysis of visible object details at various scales is critical for recognition in such settings. In general, however, computational complexity becomes prohibitive when trying to analyze multiple sub-parts of multiple objects in an image. To alleviate this problem, we propose a generative-model framework, namely, dynamic tree-structure belief networks (DTSBNs). This framework formulates object detection and recognition as inference of the DTSBN structure and image-class conditional distributions, given an image. The causal (Markovian) dependencies in DTSBNs allow for the design of computationally efficient inference, as well as for interpretation of the estimated structure as follows: each root represents a whole distinct object, while child nodes down the sub-tree represent parts of that object at various scales. Therefore, within the DTSBN framework, the treatment and recognition of object parts requires no additional training, but merely a particular interpretation of the tree/subtree structure. This property leads to a strategy for recognizing objects as a whole through recognition of their visible parts. Our experimental results demonstrate that this approach significantly outperforms strategies without explicit analysis of object parts.
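The object/part interpretation described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class and `interpret` function are hypothetical stand-ins for an inferred DTSBN structure, showing only how each root is read as a whole object and its descendants as parts at progressively finer scales.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A node of an inferred tree structure (hypothetical sketch)."""
    label: str      # image-class label assigned during inference
    scale: int      # 0 = coarsest (root); larger values = finer detail
    children: list = field(default_factory=list)

def interpret(roots):
    """Read objects and their visible parts off a forest of trees:
    each root is a whole object; its descendants are object parts."""
    scene = {}
    for root in roots:
        parts, stack = [], list(root.children)
        while stack:                      # depth-first walk of the sub-tree
            n = stack.pop()
            parts.append((n.label, n.scale))
            stack.extend(n.children)
        scene[root.label] = parts
    return scene

# Example: a partially occluded object whose visible parts were inferred
car = Node("car", 0, [Node("wheel", 1),
                      Node("door", 1, [Node("handle", 2)])])
scene = interpret([car])
```

Because recognizing a part is just reading a sub-tree, no extra training is needed beyond what produced the tree, which is the property the abstract highlights.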