Learning articulated appearance models for tracking humans: A spectral graph matching approach

  • Authors:
  • Nicolas Thome;Djamel Merad;Serge Miguet

  • Affiliations:
  • Laboratoire d'InfoRmatique en Images et Systèmes d'information, Université Lumière Lyon 2, CNRS, 5 Avenue Pierre Mendès France, 69576 Bron Cedex, France and Laboratoire d'Infor ...;Laboratoire d'Informatique de Paris 6, Université Pierre & Marie Curie, CNRS, 104 Avenue du Président Kennedy, 75016 Paris, France;Laboratoire d'Informatique de Paris 6, Université Pierre & Marie Curie, CNRS, 104 Avenue du Président Kennedy, 75016 Paris, France

  • Venue:
  • Image Communication
  • Year:
  • 2008

Abstract

Tracking an unspecified number of people in real time is one of the most challenging tasks in computer vision. In this paper, we propose an original method to achieve this goal, based on the construction of a 2D human appearance model. The general framework, a region-based tracking approach, is applicable to any type of object. We show how to specialize the method to take advantage of the structural properties of the human body. We segment its visible parts using a skeletal graph matching strategy inspired by shock graphs. Only morphological and topological information is encoded in the model graph, making the approach independent of the person's pose, the viewpoint, and the geometry or appearance of the limbs. The limb labeling makes it possible to build and update an appearance model for each body part. The resulting discriminative feature, which we denote as an articulated appearance model, captures the color, texture, and shape properties of the different limbs. It is used to identify people in complex situations (occlusion, field-of-view exit, etc.) and to maintain tracking. The model-to-image matching has proven much more robust and better founded than with existing global appearance descriptors, specifically when dealing with highly deformable objects such as humans. The only assumption for recognition is an approximate viewpoint correspondence between the different models during the matching process. The method does not rely on skin color detection, which allows us to perform tracking under any viewpoint. Occlusions can be detected by the generic part of the algorithm, and tracking is then maintained by means of a particle filter. Several results in complex situations demonstrate the capacity of the algorithm to learn people's appearance in unspecified poses and viewpoints, and its efficiency for tracking multiple humans in real time using the specific updated descriptors. Finally, the model provides an important cue for further human motion analysis.
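
The abstract relies on matching skeletal graphs by topology alone to label the limbs before per-limb appearance models are built. The following sketch is a minimal, hypothetical illustration of spectral graph matching under that assumption, not the authors' implementation: each node is embedded with the leading eigenvectors of its graph's adjacency matrix, and node correspondences are recovered by linear assignment on distances between the embeddings. The helper names (spectral_embedding, match_graphs) and the toy star-shaped body graph are introduced here only for illustration.

    import numpy as np
    from scipy.optimize import linear_sum_assignment


    def spectral_embedding(adjacency, k):
        # Embed every node with the k leading eigenvectors of the adjacency matrix.
        vals, vecs = np.linalg.eigh(adjacency)
        order = np.argsort(-np.abs(vals))
        # Absolute values remove the sign ambiguity of eigenvectors.
        return np.abs(vecs[:, order[:k]])


    def match_graphs(adj_model, adj_scene, k=3):
        # Assign scene-graph nodes to model-graph nodes from topology alone.
        k = min(k, adj_model.shape[0], adj_scene.shape[0])
        emb_model = spectral_embedding(adj_model, k)
        emb_scene = spectral_embedding(adj_scene, k)
        # Cost: distance between the spectral signatures of every node pair.
        cost = np.linalg.norm(emb_model[:, None, :] - emb_scene[None, :, :], axis=2)
        rows, cols = linear_sum_assignment(cost)
        return dict(zip(rows.tolist(), cols.tolist()))


    if __name__ == "__main__":
        # Toy star-shaped "body" graph: node 0 = torso, nodes 1-4 = head/limbs.
        adj = np.zeros((5, 5))
        for limb in range(1, 5):
            adj[0, limb] = adj[limb, 0] = 1.0
        # Matching the graph against itself maps the torso node to itself.
        print(match_graphs(adj, adj))

Because only connectivity enters the cost, such a matcher is insensitive to the pose, viewpoint, and limb appearance, which is the property the abstract attributes to its model graph; the per-limb color, texture, and shape descriptors would then be attached to the labeled nodes.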