Discriminative learning with latent variables for cluttered indoor scene understanding

Authors:
Huayan Wang;Stephen Gould;Daphne Koller
Affiliations:
Stanford University;Australian National University;Stanford University
Venue:
Communications of the ACM
Year:
2013

Citing 13
Cited 0

Mean Shift: A Robust Approach Toward Feature Space Analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence

Large Margin Methods for Structured and Interdependent Output Variables
The Journal of Machine Learning Research

Recovering Surface Layout from an Image
International Journal of Computer Vision

Learning Spatial Context: Using Stuff to Find Things
ECCV '08 Proceedings of the 10th European Conference on Computer Vision: Part I

TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context
International Journal of Computer Vision

Learning structural SVMs with latent variables
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning

Cutting-plane training of structural SVMs
Machine Learning

Object Detection with Discriminatively Trained Part-Based Models
IEEE Transactions on Pattern Analysis and Machine Intelligence

Discriminative learning with latent variables for cluttered indoor scene understanding
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part IV

Thinking inside the box: using appearance models and context based on room geometry
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part VI

From 3D scene geometry to human workspace
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition

Teaching 3D geometry to deformable part models
CVPR '12 Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Real-time indoor scene understanding using Bayesian filtering with motion cues
ICCV '11 Proceedings of the 2011 International Conference on Computer Vision

Quantified Score

Hi-index	48.22

Abstract

We address the problem of understanding an indoor scene from a single image by recovering the room geometry (floor, ceiling, and walls) and the furniture layout. A major challenge is that most indoor scenes are cluttered by furniture and decorations whose appearance varies drastically across scenes and therefore can hardly be modeled (or even hand-labeled) consistently. In this paper we tackle this problem by introducing latent variables to account for clutter, so that the observed image is jointly explained by the room layout and the clutter layout. Model parameters are learned from a training set of images labeled only with the layout of the room geometry. Our approach thus takes indoor clutter into account, and infers it, without requiring hand-labeled clutter in the training set, which is often inaccurate. Yet it outperforms the state-of-the-art method of Hedau et al., which requires clutter labels. As a latent-variable-based method, our approach has the interesting feature that its latent variables correspond directly to a concrete visual concept (clutter in the room) and are thus interpretable.
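
The idea of jointly explaining an image by a room layout and a latent clutter layout can be illustrated with a toy sketch. This is not the authors' model: the feature layout, the weight names (`surface`, `clutter`, `prior`), and the brute-force search are all assumptions made for illustration. Each image region carries a made-up cue for each of two surface labels plus a clutter cue; a candidate layout `y` assigns one surface label per region, and a latent binary mask `h` marks which regions are explained by clutter instead.

```python
from itertools import product

def score(w, x, y, h):
    """Linear score of a (layout, clutter mask) pair.

    x[r] = (cue for surface label 0, cue for surface label 1, clutter cue)
    y[r] = surface label the candidate layout assigns to region r
    h[r] = 1 if region r is explained by clutter, else 0 (latent)
    All names and features here are hypothetical.
    """
    total = 0.0
    for r, is_clutter in enumerate(h):
        if is_clutter:
            total += w["clutter"] * x[r][2]
        else:
            total += w["surface"] * x[r][y[r]]
    # Penalize clutter so the model does not explain everything away.
    total -= w["prior"] * sum(h)
    return total

def infer(w, x, layouts):
    """Jointly maximize the score over candidate layouts and clutter masks."""
    best = None
    for y in layouts:
        for h in product((0, 1), repeat=len(x)):
            s = score(w, x, y, h)
            if best is None or s > best[0]:
                best = (s, y, h)
    return best

# Three regions; the third has weak surface cues but a strong clutter cue.
x = [(0.9, 0.1, 0.2), (0.8, 0.2, 0.1), (0.1, 0.2, 0.95)]
layouts = [(0, 0, 0), (1, 1, 1)]
w = {"surface": 1.0, "clutter": 1.0, "prior": 0.5}

best_score, best_layout, best_mask = infer(w, x, layouts)
print(best_layout, best_mask)
```

In this toy setting the clutter mask is never supplied as a label; it is chosen at inference time as whatever best explains the regions that the layout cannot, which mirrors the role the abstract assigns to the latent variables. Exhaustive enumeration is only feasible because the example is tiny.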