Learning Joint Top-Down and Bottom-up Processes for 3D Visual Inference

Authors:
Cristian Sminchisescu;Atul Kanaujia;Dimitris Metaxas
Affiliations:
TTI-C;Rutgers University;Rutgers University
Venue:
CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
Year:
2006

Citing 0
Cited 22

A survey of advances in vision-based human motion capture and analysis
Computer Vision and Image Understanding - Special issue on modeling people: Vision-based understanding of a person's shape, appearance, movement, and behaviour

Conditional models for contextual human motion recognition
Computer Vision and Image Understanding - Special issue on modeling people: Vision-based understanding of a person's shape, appearance, movement, and behaviour

Vision-based human motion analysis: An overview
Computer Vision and Image Understanding

BM3E: Discriminative Density Propagation for Visual Tracking
IEEE Transactions on Pattern Analysis and Machine Intelligence

Monocular 3D tracking of articulated human motion in silhouette and pose manifolds
Journal on Image and Video Processing - Anthropocentric Video Analysis: Tools and Applications

Action recognition feedback-based framework for human pose reconstruction from monocular images
Pattern Recognition Letters

Twin Gaussian Processes for Structured Prediction
International Journal of Computer Vision

Silhouette representation and matching for 3D pose discrimination - A comparative study
Image and Vision Computing

Shared latent dynamical model for human tracking from videos
MCAM'07 Proceedings of the 2007 international conference on Multimedia content analysis and mining

Nonparametric density estimation with adaptive, anisotropic kernels for human motion tracking
Proceedings of the 2nd conference on Human motion: understanding, modeling, capture and animation

Markerless human articulated tracking using hierarchical particle swarm optimisation
Image and Vision Computing

Multiple view human articulated tracking using charting and particle swarm optimisation
Proceedings of the 1st international workshop on 3D video processing

3D human pose recovery from image by efficient visual feature selection
Computer Vision and Image Understanding

Dimensionality reduction using a Gaussian Process Annealed Particle Filter for tracking and classification of articulated body motions
Computer Vision and Image Understanding

Latent gaussian mixture regression for human pose estimation
ACCV'10 Proceedings of the 10th Asian conference on Computer vision - Volume Part III

Loose-limbed People: Estimating 3D Human Pose and Motion Using Non-parametric Belief Propagation
International Journal of Computer Vision

Fast Human Pose Detection Using Randomized Hierarchical Cascades of Rejectors
International Journal of Computer Vision

Combining information theoretic kernels with generative embeddings for classification
Neurocomputing

No bias left behind: covariate shift adaptation for discriminative 3d pose estimation
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part IV

Pose estimation with motionlet LLC coding
PCM'12 Proceedings of the 13th Pacific-Rim conference on Advances in Multimedia Information Processing

Editor's choice article: Canonical locality preserving Latent Variable Model for discriminative pose inference
Image and Vision Computing

Mixtures of Gaussian process models for human pose estimation
Image and Vision Computing

Quantified Score

Hi-index	0.00

Abstract

We present an algorithm for jointly learning a consistent bidirectional generative-recognition model that combines top-down and bottom-up processing for monocular 3D human motion reconstruction. Learning progresses in alternating stages of self-training that optimize the probability of the image evidence: the recognition model is tuned using samples from the generative model, and the generative model is optimized to produce inferences close to the ones predicted by the current recognition model. At equilibrium, the two models are consistent. During on-line inference, we scan the image at multiple locations and predict 3D human poses using the recognition model, which implicitly includes one-shot generative consistency feedback. The framework provides a uniform treatment of human detection, 3D initialization and 3D recovery from transient failure. Our experimental results show that this procedure is promising for the automatic reconstruction of human motion in more natural scene settings with background clutter and occlusion.
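
The alternating scheme described in the abstract can be illustrated with a deliberately small toy, offered as a sketch under stated assumptions and not as the authors' implementation. Here the "pose" x is a scalar with a standard normal prior, the "image feature" is r = a * x + noise with a known noise level sigma, the generative model has the single parameter a, and the recognition model is a linear predictor x ≈ b * r with residual variance s. All function names and parameters are hypothetical, and the paper's actual models for 3D human pose are not reproduced. For this linear-Gaussian stand-in the alternation amounts to a sampled form of expectation-maximization, so it raises the likelihood of the observed features.

```python
import random


def fit_recognition(a, sigma, n_samples, rng):
    """Tune the recognition model x ~ b * r on samples from the generative model."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]   # poses from the prior
    rs = [a * x + rng.gauss(0.0, sigma) for x in xs]       # synthesized features
    # Least-squares slope for predicting the pose from the feature.
    b = sum(x * r for x, r in zip(xs, rs)) / sum(r * r for r in rs)
    # Residual variance: how uncertain the recognition model is about the pose.
    s = sum((x - b * r) ** 2 for x, r in zip(xs, rs)) / n_samples
    return b, s


def fit_generative(b, s, observed):
    """Refit the generative parameter using the poses the recognition model infers."""
    num = sum(r * (b * r) for r in observed)          # sum of r * E[x | r]
    den = sum((b * r) ** 2 + s for r in observed)     # sum of E[x^2 | r]
    return num / den


def alternate(observed, sigma, a_init=0.5, n_rounds=40, n_samples=5000, seed=0):
    """Alternate the two fitting stages; returns (a, b, s) after the last round."""
    rng = random.Random(seed)
    a = a_init
    b, s = 0.0, 1.0
    for _ in range(n_rounds):
        b, s = fit_recognition(a, sigma, n_samples, rng)
        a = fit_generative(b, s, observed)
    return a, b, s


def make_observations(a_true, sigma, n, seed=1):
    """Unlabeled features standing in for image evidence (toy data only)."""
    rng = random.Random(seed)
    return [a_true * rng.gauss(0.0, 1.0) + rng.gauss(0.0, sigma) for _ in range(n)]


if __name__ == "__main__":
    data = make_observations(a_true=2.0, sigma=1.0, n=2000)
    a, b, s = alternate(data, sigma=1.0)
    print("generative a:", a, "recognition b:", b, "residual variance s:", s)
```

In this toy, the two models agree once the recognition slope b matches the posterior mean implied by a, that is b ≈ a / (a² + sigma²), which is a small-scale analogue of the consistency at equilibrium that the abstract mentions. The sign of a is not identifiable from the features alone, so the result depends on the positive initial value.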