Performance Capture of Interacting Characters with Handheld Kinects

  • Authors:
  • Genzhi Ye;Yebin Liu;Nils Hasler;Xiangyang Ji;Qionghai Dai;Christian Theobalt

  • Affiliations:
  • Department of Automation, Tsinghua University, Beijing, China;Department of Automation, Tsinghua University, Beijing, China;Max-Planck Institute for Informatics, Saarbrücken, Germany;Department of Automation, Tsinghua University, Beijing, China;Department of Automation, Tsinghua University, Beijing, China;Max-Planck Institute for Informatics, Saarbrücken, Germany

  • Venue:
  • ECCV'12 Proceedings of the 12th European Conference on Computer Vision - Volume Part II
  • Year:
  • 2012

Abstract

We present an algorithm for marker-less performance capture of interacting humans using only three hand-held Kinect cameras. Our method reconstructs human skeletal poses, deforming surface geometry and camera poses for every time step of the depth video. Skeletal configurations and camera poses are found by solving a joint energy minimization problem which optimizes the alignment of RGBZ data from all cameras, as well as the alignment of human shape templates to the Kinect data. The energy function is based on a combination of geometric correspondence finding, implicit scene segmentation, and correspondence finding using image features. Only the combination of geometric and photometric correspondences and the integration of human pose and camera pose estimation enables reliable performance capture with only three sensors. As opposed to previous performance capture methods, our algorithm succeeds on general uncontrolled indoor scenes with potentially dynamic background, and it succeeds even if the cameras are moving.
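The core of the method is a joint energy minimization that simultaneously aligns shape templates and estimates poses. As a rough illustration of this idea (not the paper's actual energy, which combines geometric, segmentation, and photometric terms over skeletal and camera parameters), the following sketch minimizes a purely geometric alignment energy over a rigid 2D pose using NumPy and SciPy; the point sets, parameters, and function names are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def transform(points, theta, tx, ty):
    # Rigid 2D transform: rotate by theta, then translate by (tx, ty).
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return points @ R.T + np.array([tx, ty])

def joint_energy(params, template, observed):
    # Geometric term: sum of squared point-to-point residuals between
    # the posed template and the observed points. This stands in for
    # the template-to-depth-data alignment terms; the paper's full
    # energy also includes segmentation and photometric feature terms.
    aligned = transform(template, *params)
    return np.sum((aligned - observed) ** 2)

# Toy data: "observed" points are the template under an unknown pose.
rng = np.random.default_rng(0)
template = rng.standard_normal((50, 2))
true_params = (0.4, 1.0, -0.5)
observed = transform(template, *true_params)

# Minimize the energy over the pose parameters from a neutral start.
res = minimize(joint_energy, x0=[0.0, 0.0, 0.0],
               args=(template, observed), method="BFGS")
```

In the actual system the unknowns include the skeletal configurations of both characters and the poses of all three moving Kinects, so the optimization is far higher-dimensional, but the structure is the same: a single energy over all unknowns, minimized per time step.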