Interactive images: cuboid proxies for smart image manipulation
ACM Transactions on Graphics (TOG) - SIGGRAPH 2012 Conference Proceedings
People watching: human actions as a cue for single view geometry
ECCV'12 Proceedings of the 12th European Conference on Computer Vision - Volume Part V
Indoor segmentation and support inference from RGBD images
ECCV'12 Proceedings of the 12th European Conference on Computer Vision - Volume Part V
Scene semantics from long-term observation of people
ECCV'12 Proceedings of the 12th European Conference on Computer Vision - Volume Part VI
Efficient exact inference for 3d indoor scene understanding
ECCV'12 Proceedings of the 12th European Conference on Computer Vision - Volume Part VI
Human-centric indoor environment modeling from depth videos
ECCV'12 Proceedings of the 12th European Conference on Computer Vision - Volume 2
Discriminative learning with latent variables for cluttered indoor scene understanding
Communications of the ACM
Exploiting depth camera for 3D spatial relationship interpretation
Proceedings of the 4th ACM Multimedia Systems Conference
We present a human-centric paradigm for scene understanding. Our approach goes beyond estimating 3D scene geometry and predicts the "workspace" of a human, represented by a data-driven vocabulary of human interactions. Our method builds upon recent work in indoor scene understanding and the availability of motion capture data to create a joint space of human poses and scene geometry by modeling the physical interactions between the two. This joint space can then be used to predict potential human poses and joint locations from a single image. In a way, this work revisits the principle of Gibsonian affordances, reinterpreting it for the modern, data-driven era.
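The abstract describes scoring human poses against scene geometry via their physical interactions. As a rough illustration only, the sketch below shows one plausible form such a compatibility check could take: a pose (a set of 3D joint positions, e.g. from motion capture) is scored against a voxelized scene by requiring joints to lie in free space and the lowest joint to be supported near the floor. The function name, voxel size, and support tolerance are all illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch of pose/scene compatibility scoring, in the
# spirit of a joint space of human poses and scene geometry.
# All names, units, and thresholds are illustrative assumptions.

def pose_compatibility(joints, occupied, floor_z=0.0, tol=0.15):
    """Score how well a 3D pose fits the free space of a scene.

    joints   : list of (x, y, z) joint positions in metres
    occupied : set of integer voxel indices (i, j, k) that are solid
    Returns a score in [0, 1]: the fraction of joints lying in free
    space, zeroed out if the lowest joint is not supported near the
    floor (a crude stand-in for physical support constraints).
    """
    def voxel(point, size=0.1):
        # Map a metric position to its integer voxel index.
        return tuple(int(c // size) for c in point)

    free = sum(1 for j in joints if voxel(j) not in occupied)
    lowest_z = min(j[2] for j in joints)
    supported = abs(lowest_z - floor_z) <= tol  # e.g. feet on floor
    return (free / len(joints)) if supported else 0.0

# Toy scene: a solid voxel column near the origin.
occupied = {(0, 0, k) for k in range(20)}
standing = [(1.0, 1.0, 0.0), (1.0, 1.0, 0.9), (1.0, 1.0, 1.6)]  # clear
blocked = [(0.05, 0.05, 0.0), (0.05, 0.05, 0.9)]  # inside the column
```

In a full system one would evaluate many candidate poses from a motion-capture vocabulary at many scene locations and keep the high-scoring ones; this toy version only checks occupancy and a single support constraint.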