Combining Generative and Discriminative Models in a Framework for Articulated Pose Estimation

Authors:
Rómer Rosales;Stan Sclaroff
Affiliations:
Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA 02139;Image and Video Computing Group, Dept. of Computer Science, Boston University, Boston, USA 02215
Venue:
International Journal of Computer Vision
Year:
2006

Abstract

We develop a method for the estimation of articulated pose, such as that of the human body or the human hand, from a single (monocular) image. Pose estimation is formulated as a statistical inference problem, where the goal is to find a posterior probability distribution over poses as well as a maximum a posteriori (MAP) estimate. The method combines two modeling approaches, one discriminative and the other generative. The discriminative model consists of a set of mapping functions that are constructed automatically from a labeled training set of body poses and their respective image features. The discriminative formulation allows for modeling ambiguous, one-to-many mappings (through the use of multi-modal distributions) that may yield multiple valid articulated pose hypotheses from a single image. The generative model is defined in terms of a computer graphics rendering of poses. While the generative model offers an accurate way to relate observed (image features) and hidden (body pose) random variables, it is difficult to use it directly in pose estimation, since inference is computationally intractable. In contrast, inference with the discriminative model is tractable, but considerably less accurate for the problem of interest. A combined discriminative/generative formulation is derived that leverages the complementary strengths of both models in a principled framework for articulated pose inference. Two efficient MAP pose estimation algorithms are derived from this formulation; the first is deterministic and the second is non-deterministic. Performance of the framework is quantitatively evaluated in estimating articulated pose of both the human hand and human body.
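
The idea of letting a discriminative model propose several pose hypotheses and a generative model rank them can be illustrated with a toy sketch. This is not the authors' algorithm or code. Everything here is an assumption made for illustration: a planar two-link arm stands in for the articulated body, the fingertip position stands in for the image features, the hand-written functions in `MAPPINGS` stand in for learned mapping functions, and the Gaussian likelihood and prior (with their `sigma` and `rest` parameters) are hypothetical choices.

```python
import math

def render(pose):
    """Toy generative model: a planar arm with two unit-length links.
    Maps joint angles (t1, t2) to the fingertip position, used here as the 'image feature'."""
    t1, t2 = pose
    return (math.cos(t1) + math.cos(t1 + t2),
            math.sin(t1) + math.sin(t1 + t2))

def log_likelihood(features, pose, sigma=0.05):
    """Gaussian log-likelihood (up to a constant) of the observed features given a pose."""
    rx, ry = render(pose)
    return -((features[0] - rx) ** 2 + (features[1] - ry) ** 2) / (2 * sigma ** 2)

def log_prior(pose, rest=(0.3, 1.0), sigma=1.0):
    """Hypothetical Gaussian prior centered on an assumed rest pose."""
    return -sum((p - r) ** 2 for p, r in zip(pose, rest)) / (2 * sigma ** 2)

def _inverse(features, sign):
    """One branch of the feature-to-pose mapping; sign selects the elbow configuration."""
    x, y = features
    # For unit links, x^2 + y^2 = 2 + 2*cos(t2); clamp guards against rounding error.
    c = max(-1.0, min(1.0, (x * x + y * y - 2.0) / 2.0))
    t2 = sign * math.acos(c)
    t1 = math.atan2(y, x) - t2 / 2.0
    return (t1, t2)

# Stand-ins for the discriminative mapping functions: the same features map to
# several candidate poses (one-to-many), plus one deliberately crude guess.
MAPPINGS = [
    lambda f: _inverse(f, +1),
    lambda f: _inverse(f, -1),
    lambda f: (0.0, 0.0),
]

def map_estimate(features):
    """Score every discriminative hypothesis with the generative model and the prior,
    then return the highest-scoring pose."""
    hypotheses = [m(features) for m in MAPPINGS]
    return max(hypotheses, key=lambda h: log_likelihood(features, h) + log_prior(h))

observed = render((0.4, 0.9))   # synthesize features from a known pose
best = map_estimate(observed)   # pose chosen among the candidate hypotheses
```

In this toy setting the two inverse branches explain the observed features equally well, so the prior decides between them, while the crude constant guess is rejected because its rendering is far from the observation. Only a handful of candidates is ever rendered, which is what keeps the search cheap compared with exploring the full pose space under the generative model.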