Bottom-up recognition and parsing of the human body

  • Authors:
  • Praveen Srinivasan;Jianbo Shi

  • Affiliations:
  • GRASP Lab, University of Pennsylvania, Philadelphia, PA;GRASP Lab, University of Pennsylvania, Philadelphia, PA

  • Venue:
  • EMMCVPR'07 Proceedings of the 6th international conference on Energy minimization methods in computer vision and pattern recognition
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Recognizing humans, estimating their pose and segmenting their body parts are key to high-level image understanding. Because humans are highly articulated, the range of deformations they undergo makes this task extremely challenging. Previous methods have focused largely on heuristics or pairwise part models in approaching this problem. We propose a bottom-up growing, similar to parsing, of increasingly more complete partial body masks guided by a set of parse rules. At each level of the growing process, we evaluate the partial body masks directly via shape matching with exemplars (and also image features), without regard to how the hypotheses are formed. The body is evaluated as a whole, not the sum of its parts, unlike previous approaches. Multiple image segmentations are included at each of the levels of the growing/parsing, to augment existing hypotheses or to introduce ones. Our method yields both a pose estimate as well as a segmentation of the human. We demonstrate competitive results on this challenging task with relatively few training examples on a dataset of baseball players with wide pose variation. Our method is comparatively simple and could be easily extended to other objects. We also give a learning framework for parse ranking that allows us to keep fewer parses for similar performance.