Monocular 3D scene modeling and inference: understanding multi-object traffic scenes

  • Authors:
  • Christian Wojek; Stefan Roth; Konrad Schindler; Bernt Schiele

  • Affiliations:
  • Computer Science Department, TU Darmstadt and MPI Informatics, Saarbrücken; Computer Science Department, TU Darmstadt; Computer Science Department, TU Darmstadt and Photogrammetry and Remote Sensing Group, ETH Zürich; Computer Science Department, TU Darmstadt and MPI Informatics, Saarbrücken

  • Venue:
  • ECCV'10: Proceedings of the 11th European Conference on Computer Vision: Part IV
  • Year:
  • 2010

Abstract

Scene understanding has (again) become a focus of computer vision research, leveraging advances in detection, context modeling, and tracking. In this paper, we present a novel probabilistic 3D scene model that encompasses multi-class object detection, object tracking, scene labeling, and 3D geometric relations. This integrated 3D model is able to represent complex interactions such as inter-object occlusion, physical exclusion between objects, and geometric context. Inference in this model makes it possible to recover 3D scene context and perform 3D multi-object tracking from a mobile observer, for objects of multiple categories, using only monocular video as input. In particular, we show that a joint scene tracklet model for the evidence collected over multiple frames substantially improves performance. The approach is evaluated on two different types of challenging on-board sequences. We first show a substantial improvement over the state of the art in 3D multi-people tracking. Moreover, a similar performance gain is achieved for multi-class 3D tracking of cars and trucks on a new, challenging dataset.