Interactive images: cuboid proxies for smart image manipulation
ACM Transactions on Graphics (TOG) - SIGGRAPH 2012 Conference Proceedings
People watching: human actions as a cue for single view geometry
ECCV'12 Proceedings of the 12th European Conference on Computer Vision - Volume Part V
Indoor segmentation and support inference from RGBD images
ECCV'12 Proceedings of the 12th European Conference on Computer Vision - Volume Part V
Scene semantics from long-term observation of people
ECCV'12 Proceedings of the 12th European Conference on Computer Vision - Volume Part VI
Efficient exact inference for 3d indoor scene understanding
ECCV'12 Proceedings of the 12th European Conference on Computer Vision - Volume Part VI
Human-centric indoor environment modeling from depth videos
ECCV'12 Proceedings of the 12th European Conference on Computer Vision - Volume 2
Discriminative learning with latent variables for cluttered indoor scene understanding
Communications of the ACM
Exploiting depth camera for 3D spatial relationship interpretation
Proceedings of the 4th ACM Multimedia Systems Conference
We present a human-centric paradigm for scene understanding. Our approach goes beyond estimating 3D scene geometry and predicts the "workspace" of a human, represented by a data-driven vocabulary of human interactions. Our method builds upon recent work in indoor scene understanding and the availability of motion capture data to create a joint space of human poses and scene geometry by modeling the physical interactions between the two. This joint space can then be used to predict potential human poses and joint locations from a single image. In a way, this work revisits the principle of Gibsonian affordances, reinterpreting it for the modern, data-driven era.
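The abstract describes scoring human poses against scene geometry via their physical interactions. As a rough illustration only, the sketch below shows one plausible form such a compatibility check could take: a pose (a set of 3D joint positions, e.g. from motion capture) is scored against a voxelized scene by requiring joints to lie in free space and the lowest joint to be supported near the floor. The function name, voxel size, and support tolerance are all illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch of pose/scene compatibility scoring, in the
# spirit of a joint space of human poses and scene geometry.
# All names, units, and thresholds are illustrative assumptions.

def pose_compatibility(joints, occupied, floor_z=0.0, tol=0.15):
    """Score how well a 3D pose fits the free space of a scene.

    joints   : list of (x, y, z) joint positions in metres
    occupied : set of integer voxel indices (i, j, k) that are solid
    Returns a score in [0, 1]: the fraction of joints lying in free
    space, zeroed out if the lowest joint is not supported near the
    floor (a crude stand-in for physical support constraints).
    """
    def voxel(point, size=0.1):
        # Map a metric position to its integer voxel index.
        return tuple(int(c // size) for c in point)

    free = sum(1 for j in joints if voxel(j) not in occupied)
    lowest_z = min(j[2] for j in joints)
    supported = abs(lowest_z - floor_z) <= tol  # e.g. feet on floor
    return (free / len(joints)) if supported else 0.0

# Toy scene: a solid voxel column near the origin.
occupied = {(0, 0, k) for k in range(20)}
standing = [(1.0, 1.0, 0.0), (1.0, 1.0, 0.9), (1.0, 1.0, 1.6)]  # clear
blocked = [(0.05, 0.05, 0.0), (0.05, 0.05, 0.9)]  # inside the column
```

In a full system one would evaluate many candidate poses from a motion-capture vocabulary at many scene locations and keep the high-scoring ones; this toy version only checks occupancy and a single support constraint.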