Dense 3-D Reconstruction of an Outdoor Scene by Hundreds-Baseline Stereo Using a Hand-Held Video Camera

  • Authors:
  • Tomokazu Sato;Masayuki Kanbara;Naokazu Yokoya;Haruo Takemura

  • Affiliations:
  • Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0101, Japan. tomoka-s@is.aist-nara.ac.jp;Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0101, Japan. masay-ka@is.aist-nara.ac.jp;Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0101, Japan. yokoya@is.aist-nara.ac.jp;Cybermedia Center, Osaka University, 1-30 Machikaneyama, Toyonaka, Osaka, 560-0048, Japan. takemura@cmc.osaka-u.ac.jp

  • Venue:
  • International Journal of Computer Vision
  • Year:
  • 2002

Abstract

Three-dimensional (3-D) models of outdoor scenes are widely used for object recognition, navigation, mixed reality, and other applications. Because such models are often built manually at high cost, automatic 3-D reconstruction has been widely investigated. In related work, dense 3-D models are generated using stereo methods. However, such approaches cannot use several hundred images together for dense depth estimation, because it is difficult to accurately calibrate a large number of cameras. In this paper, we propose a dense 3-D reconstruction method that first estimates the extrinsic camera parameters of a hand-held video camera and then reconstructs a dense 3-D model of the scene. In the first process, the extrinsic camera parameters are estimated by automatically tracking natural features together with a small number of predefined markers whose 3-D positions are known. Then, several hundred dense depth maps obtained by multi-baseline stereo are combined in a voxel space. As a result, a dense 3-D model of the outdoor scene can be acquired accurately from several hundred input images captured by a hand-held video camera.
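The two stages named in the abstract, multi-baseline stereo and the combination of depth maps in a voxel space, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a much simpler setting than a freely moving hand-held camera: 1-D scanlines from cameras translated sideways by known baselines, a pinhole model in which disparity equals focal length times baseline divided by depth, and a sum of squared differences accumulated over all baselines as the matching cost. The function names `sssd_depth` and `vote_voxels`, the out-of-range `penalty`, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import math


def sssd_depth(ref, others, baselines, focal, x, depths, half_win=1, penalty=1e6):
    """Return the candidate depth with the lowest summed SSD over all baselines.

    Assumption: every other camera is shifted sideways by baseline b, so a
    point at depth z seen at column xr in `ref` appears at column xr - d,
    with d = focal * b / z rounded to whole pixels.
    """
    best_depth, best_cost = None, math.inf
    for z in depths:
        cost = 0.0
        for img, b in zip(others, baselines):
            d = int(round(focal * b / z))
            for k in range(-half_win, half_win + 1):
                xr, xo = x + k, x + k - d
                if 0 <= xr < len(ref) and 0 <= xo < len(img):
                    cost += (ref[xr] - img[xo]) ** 2
                else:
                    cost += penalty  # hypothetical cost for samples outside the image
        if cost < best_cost:
            best_depth, best_cost = z, cost
    return best_depth


def vote_voxels(points, voxel_size):
    """Count how many back-projected 3-D points fall into each voxel."""
    votes = {}
    for px, py, pz in points:
        key = (
            math.floor(px / voxel_size),
            math.floor(py / voxel_size),
            math.floor(pz / voxel_size),
        )
        votes[key] = votes.get(key, 0) + 1
    return votes


# Synthetic scanline: a fronto-parallel surface at depth 4 seen with focal length 8.
ref = [i * i for i in range(24)]
focal, true_depth = 8.0, 4.0
baselines = [1.0, 2.0, 3.0]


def shifted(row, d):
    """Simulate the view of a camera whose disparity for this surface is d pixels."""
    return [row[j + d] if j + d < len(row) else 0 for j in range(len(row))]


others = [shifted(ref, int(round(focal * b / true_depth))) for b in baselines]
estimated = sssd_depth(ref, others, baselines, focal, x=12, depths=[2.0, 4.0, 8.0])
votes = vote_voxels([(0.1, 0.1, 0.1), (0.2, 0.3, 0.4), (1.5, 0.0, 0.0)], voxel_size=1.0)
```

Summing the cost over several baselines means that a wrong depth which happens to match well in one image is still penalised by the others. Likewise, voxels supported by many depth maps can be kept while isolated votes are discarded. A real system would replace the sideways-translation assumption with the estimated extrinsic parameters of each frame.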