Contextually guided semantic labeling and search for three-dimensional point clouds

  • Authors:
  • Abhishek Anand; Hema Swetha Koppula; Thorsten Joachims; Ashutosh Saxena

  • Affiliations:
  • Department of Computer Science, Cornell University, USA (all authors)

  • Venue:
  • International Journal of Robotics Research
  • Year:
  • 2013

Abstract

RGB-D cameras, which give an RGB image together with depths, are becoming increasingly popular for robotic perception. In this paper, we address the task of detecting commonly found objects in the three-dimensional (3D) point cloud of indoor scenes obtained from such cameras. Our method uses a graphical model that captures various features and contextual relations, including local visual appearance and shape cues, object co-occurrence relationships, and geometric relationships. With a large number of object classes and relations, the model's parsimony becomes important, and we address that by using multiple types of edge potentials. We train the model using a maximum-margin learning approach. In our experiments on a total of 52 3D scenes of homes and offices (composed from about 550 views), we achieve labeling performance of 84.06% on office scenes and 73.38% on home scenes, for 17 object classes each. We also present a method for a robot to search for an object using the learned model and the contextual information available from the current labelings of the scene. We applied this algorithm successfully on a mobile robot for the task of finding 12 object classes in 10 different offices, achieving a precision of 97.56% at 78.43% recall.
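
As a rough illustration of the kind of model the abstract describes (not the authors' actual formulation), the sketch below scores a candidate labeling of point-cloud segments using node potentials (local appearance and shape features) and edge potentials (pairwise contextual relations such as co-occurrence and geometry), then picks the highest-scoring labeling by brute force. The class names, feature values, and weights are hypothetical; the paper learns the weights with a maximum-margin approach and uses a proper inference procedure rather than enumeration.

```python
import numpy as np
from itertools import product

# Hypothetical toy setup: a small label set and random weights standing in
# for parameters that the paper would learn with max-margin training.
CLASSES = ["wall", "table", "monitor"]
K = len(CLASSES)

rng = np.random.default_rng(0)
w_node = rng.normal(size=(K, 4))      # per-class weights on 4 node features
w_edge = rng.normal(size=(K, K, 2))   # per-class-pair weights on 2 edge features

def score_labeling(node_feats, edges, edge_feats, labels):
    """Total score of one labeling: sum of node potentials plus
    edge potentials over the segment-adjacency graph."""
    s = sum(w_node[labels[i]] @ node_feats[i] for i in range(len(labels)))
    for (i, j), f in zip(edges, edge_feats):
        s += w_edge[labels[i], labels[j]] @ f
    return s

# Toy scene: 3 segments with made-up features, 2 adjacency edges.
node_feats = rng.normal(size=(3, 4))  # e.g. local appearance/shape cues
edges = [(0, 1), (1, 2)]
edge_feats = rng.normal(size=(2, 2))  # e.g. relative height, coplanarity

# Brute-force MAP inference over all K^3 labelings; fine for a toy scene,
# but only for illustration (the paper uses a real inference method).
best = max(product(range(K), repeat=3),
           key=lambda y: score_labeling(node_feats, edges, edge_feats, list(y)))
print([CLASSES[c] for c in best])
```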