Learning Generative Models for Multi-Activity Body Pose Estimation

Authors:
Tobias Jaeggli;Esther Koller-Meier;Luc Van Gool
Affiliations:
ETH Zurich, Zurich, Switzerland;ETH Zurich, Zurich, Switzerland;ETH Zurich, Zurich, Switzerland and KU Leuven, Leuven, Belgium
Venue:
International Journal of Computer Vision
Year:
2009

Abstract

We present a method to simultaneously estimate 3D body pose and action categories from monocular video sequences. Our approach learns a generative model of the relationship between body pose and image appearance using a sparse kernel regressor. Body poses are modelled on a low-dimensional manifold obtained by Locally Linear Embedding dimensionality reduction. In addition, we learn a prior model of likely body poses and a dynamical model in this pose manifold. Sparse kernel regressors capture the nonlinearities of this mapping efficiently. Within a Recursive Bayesian Sampling framework, the potentially multimodal posterior probability distributions can then be inferred. An activity-switching mechanism based on learned transfer functions allows for inference of the performed activity class, along with the estimation of body pose and 2D image location of the subject. Using a rough foreground segmentation, we compare Binary PCA and distance transforms to encode the appearance. As a postprocessing step, the globally optimal trajectory through the entire sequence is estimated, yielding a single pose estimate per frame that is consistent throughout the sequence. We evaluate the algorithm on challenging sequences with subjects who alternate between running and walking movements. Our experiments show how the dynamical model helps to track through poorly segmented low-resolution image sequences where tracking otherwise fails, while at the same time reliably classifying the activity type.
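
The postprocessing step mentioned in the abstract, selecting one globally consistent pose per frame from the per-frame sample sets, can be illustrated with a Viterbi-style dynamic program. The abstract does not give the exact procedure, so the following is only a minimal sketch under stated assumptions: poses are reduced to a single hypothetical 1-D latent coordinate, the transition score is an assumed Gaussian random walk rather than the learned dynamical model, and the names `best_trajectory`, `samples`, `log_weights`, and `sigma` are illustrative.

```python
def best_trajectory(samples, log_weights, sigma=1.0):
    """Viterbi-style dynamic programming over per-frame pose samples.

    samples[t][i]     : hypothetical 1-D latent pose of sample i at frame t
    log_weights[t][i] : log observation score of that sample
    sigma             : assumed std. dev. of a Gaussian random-walk transition
    Returns one sample index per frame (the highest-scoring path).
    """
    def log_trans(a, b):
        # Unnormalised Gaussian log transition score (an assumption,
        # standing in for a learned dynamical model).
        return -0.5 * ((b - a) / sigma) ** 2

    score = list(log_weights[0])  # best path score ending in each sample
    back = []                     # back[t-1][j]: best predecessor of sample j at frame t
    for t in range(1, len(samples)):
        new_score, pointers = [], []
        for j, xj in enumerate(samples[t]):
            cands = [score[i] + log_trans(xi, xj)
                     for i, xi in enumerate(samples[t - 1])]
            best_i = max(range(len(cands)), key=cands.__getitem__)
            new_score.append(cands[best_i] + log_weights[t][j])
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Backtrack from the best final sample to recover the full path.
    j = max(range(len(score)), key=score.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path


# Toy example: two pose hypotheses per frame, far apart in latent space.
samples = [[0.0, 5.0], [0.1, 5.1], [0.2, 5.2]]
log_weights = [[0.0, -1.0], [-1.0, -0.9], [0.0, -1.0]]
path = best_trajectory(samples, log_weights)
```

In this toy input, sample 1 has the slightly better observation score at the middle frame, but jumping to it and back would incur a large transition penalty, so the selected path stays on sample 0 throughout. This is the sense in which a sequence-level estimate can differ from picking the best sample frame by frame.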