Human pose estimation from a single view point

  • Authors:
  • Gerard Medioni; Matheen Siddiqui

  • Affiliations:
  • University of Southern California; University of Southern California

  • Year:
  • 2009


Abstract

We address the estimation of human pose from a single view point in images and sequences. This is an important problem with a range of applications in human-computer interaction, security and surveillance monitoring, image understanding, and motion capture. In this work we develop methods that make use of single-view color cameras, stereo cameras, and range sensors.

First, we develop a 2D limb-tracking scheme for color images using skin color and edge information. Multiple 2D limb models are used to enhance tracking of the underlying 3D structure, including models for lateral forearm views (waving) as well as for pointing gestures.

In our color-image pose-tracking framework, we find candidate 2D articulated model configurations by searching for locally optimal configurations under a weak but computationally manageable fitness function. By parameterizing 2D poses by their joint locations, organized in a tree structure, candidates can be localized efficiently and exhaustively in a bottom-up manner. We then adapt this algorithm to sequences and develop methods to automatically construct a fitness function from annotated image data.

With a stereo camera, we use depth data to track a user's movement with an articulated upper-body model. We define an objective function that evaluates the saliency of this upper-body model against a stereo depth image, and we track the user's arms by numerically maintaining the optimum with an annealed particle filter.

With range sensors, we use a data-driven MCMC (DDMCMC) approach to find an optimal pose based on a likelihood that compares synthesized and observed depth images. To speed up convergence of this search, we use bottom-up detectors that generate candidate part locations. Our Markov chain dynamics explore solutions around these parts and thus combine bottom-up and top-down processing. The current performance is 10 fps, and we provide a quantitative performance evaluation using hand-annotated data. We demonstrate a significant improvement over a baseline ICP approach. This algorithm is then adapted to estimate subject-specific shape parameters for use in tracking.
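The tree-structured, bottom-up search over joint locations described in the abstract can be sketched as a small dynamic program over a kinematic tree. The joint names, the coarse candidate grid, and the toy unary/pairwise fitness terms below are illustrative assumptions, not the thesis's actual model or code:

```python
# A minimal sketch of bottom-up search over tree-structured 2D poses.
# The kinematic tree, candidate grid, and fitness terms are toy assumptions.
from itertools import product

# Kinematic tree: each joint maps to its parent (None for the root).
PARENTS = {"torso": None, "shoulder": "torso", "elbow": "shoulder", "wrist": "elbow"}

# Candidate 2D locations per joint (a coarse grid stands in for image positions).
CANDIDATES = {j: list(product(range(3), range(3))) for j in PARENTS}

def unary(joint, loc):
    """Toy appearance score (e.g., skin-color/edge support); peaks at (1, 1)."""
    x, y = loc
    return -(x - 1) ** 2 - (y - 1) ** 2

def pairwise(parent_loc, child_loc):
    """Toy articulation score penalizing stretched limbs."""
    dx, dy = parent_loc[0] - child_loc[0], parent_loc[1] - child_loc[1]
    return -(dx * dx + dy * dy)

def best_pose():
    """Score subtrees from the leaves up (bottom-up), then backtrack from the root."""
    children = {j: [c for c, p in PARENTS.items() if p == j] for j in PARENTS}
    score, argmax = {}, {}

    def solve(joint):
        table = {}
        for loc in CANDIDATES[joint]:
            s = unary(joint, loc)
            for ch in children[joint]:
                if ch not in score:
                    solve(ch)
                best = max(CANDIDATES[ch],
                           key=lambda cl: score[ch][cl] + pairwise(loc, cl))
                s += score[ch][best] + pairwise(loc, best)
                argmax[(joint, loc, ch)] = best
            table[loc] = s
        score[joint] = table

    root = next(j for j, p in PARENTS.items() if p is None)
    solve(root)
    pose = {root: max(CANDIDATES[root], key=lambda l: score[root][l])}
    stack = [root]
    while stack:
        j = stack.pop()
        for ch in children[j]:
            pose[ch] = argmax[(j, pose[j], ch)]
            stack.append(ch)
    return pose
```

Because the tree decouples the children given a parent location, each subtree is scored once per candidate parent location, which is what makes the exhaustive bottom-up enumeration of candidates tractable.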
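The annealed-particle-filter idea from the stereo stage can likewise be sketched on a one-dimensional toy pose. The Gaussian objective, the annealing schedule, and the noise decay below are illustrative assumptions standing in for the depth-saliency objective and the real articulated state:

```python
# A minimal sketch of one annealed-particle-filter update on a 1D toy pose.
# The objective, annealing layers, and noise schedule are toy assumptions.
import math
import random

random.seed(0)

def fitness(pose):
    """Toy stand-in for the depth-saliency objective; peaks at pose = 2.0."""
    return math.exp(-((pose - 2.0) ** 2))

def annealed_step(particles, layers=(0.25, 0.5, 1.0), noise=0.5):
    """Weight with a sharpening exponent beta, resample, then diffuse,
    shrinking the diffusion noise as the objective is annealed."""
    for i, beta in enumerate(layers):
        weights = [fitness(p) ** beta for p in particles]
        # Multinomial resampling proportional to the annealed weights.
        particles = random.choices(particles, weights=weights, k=len(particles))
        sigma = noise * (0.5 ** i)
        particles = [p + random.gauss(0.0, sigma) for p in particles]
    return particles

# Track one toy "sequence" starting from a broad initial guess.
particles = [random.uniform(-5.0, 5.0) for _ in range(200)]
for _ in range(5):
    particles = annealed_step(particles)
estimate = sum(particles) / len(particles)
```

The early layers use a flattened (small-beta) objective so particles survive away from the peak, and later layers sharpen it so the population concentrates near the optimum rather than collapsing onto an early local mode.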