Joint spatio-temporal depth features fusion framework for 3D structure estimation in urban environment

  • Authors:
  • Mohamad Motasem Nawaf; Alain Trémeau

  • Affiliations:
  • Laboratoire Hubert Curien UMR CNRS 5516, Université Jean Monnet, Saint-Etienne, France (both authors)

  • Venue:
  • ECCV'12 Proceedings of the 12th international conference on Computer Vision - Volume Part III
  • Year:
  • 2012


Abstract

We present a novel approach to improving 3D structure estimation from an image stream in urban scenes. We consider a particular setup in which the camera is mounted on a moving vehicle. Applying traditional structure from motion (SfM) techniques in this case yields poor estimates of the 3D structure for several reasons, such as texture-less images, small baseline variations, and dominant forward camera motion. Our idea is to introduce the monocular depth cues that exist in a single image and to add temporal constraints on the estimated 3D structure. We assume that the scene is made up of small planar patches, obtained with an over-segmentation method, and our goal is to estimate the 3D position of each of these planes. We propose a fusion framework that employs a Markov Random Field (MRF) model to integrate both spatial and temporal depth information. An advantage of our model is that it performs well even when some depth information is missing. Spatial depth information is obtained through a global and local feature extraction method inspired by Saxena et al. [1]. Temporal depth information is obtained via a sparse optical-flow-based structure from motion approach, which reduces estimation ambiguity by enforcing constraints on the camera motion. Finally, we apply a fusion scheme to produce a single 3D structure estimate.
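The abstract's fusion idea can be illustrated with a minimal sketch: treat each planar patch as an MRF node whose data terms pull it toward whichever depth cues (spatial, temporal) are available, with pairwise smoothness between neighboring patches. The quadratic energy below is minimized by Gauss-Seidel coordinate descent. This is an illustrative assumption, not the paper's actual model: the function name `fuse_depths`, the single-scalar-depth-per-patch simplification, the `NaN`-for-missing convention, and the smoothness weight `lam` are all hypothetical choices for the sketch.

```python
import numpy as np

def fuse_depths(d_spatial, d_temporal, neighbors, lam=1.0, iters=100):
    """Fuse per-patch spatial and temporal depth cues with a quadratic
    MRF energy (hypothetical simplification of the paper's model):

        E(d) = sum_i sum_{c in available cues(i)} (d_i - c)^2
             + lam * sum_{(i,j) neighbors} (d_i - d_j)^2

    NaN marks a missing cue, so a patch with only one cue (or none)
    is still estimated via smoothness with its neighbors.  The energy
    is quadratic, so Gauss-Seidel coordinate descent converges.
    """
    n = len(d_spatial)
    # Initialize each patch from whichever cue is available (0 if none).
    d = np.where(np.isnan(d_spatial),
                 np.where(np.isnan(d_temporal), 0.0, d_temporal),
                 d_spatial)
    for _ in range(iters):
        for i in range(n):
            num, den = 0.0, 0.0
            # Data terms: only cues that are actually observed.
            if not np.isnan(d_spatial[i]):
                num += d_spatial[i]; den += 1.0
            if not np.isnan(d_temporal[i]):
                num += d_temporal[i]; den += 1.0
            # Pairwise smoothness with neighboring patches.
            for j in neighbors[i]:
                num += lam * d[j]; den += lam
            if den > 0.0:
                # Closed-form minimizer of E in d_i with others fixed.
                d[i] = num / den
    return d
```

A usage example on a three-patch chain where each cue source covers only some patches: `fuse_depths(np.array([1.0, np.nan, 3.0]), np.array([np.nan, 2.0, np.nan]), {0: [1], 1: [0, 2], 2: [1]})` fills the gaps by borrowing depth from neighbors, which mirrors the abstract's claim that the model "performs well even when some depth information is missing."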