Spacetime stereo and its applications

  • Authors:
  • Li Zhang; Steven M. Seitz

  • Affiliations:
  • University of Washington; University of Washington

  • Venue:
  • Doctoral dissertation, University of Washington
  • Year:
  • 2005

A long-standing challenge in computer vision is to recover 3D shape from images, especially when the shape changes over time. Among the many existing shape acquisition methods, very few can accurately measure time-varying shapes at both high spatial and high temporal resolution. Furthermore, some previous methods make strict assumptions about surface color and texture and do not work robustly for complex object shapes.

In this dissertation, we address these limitations and present a new approach, spacetime stereo, for dynamic shape acquisition. Spacetime stereo formulates stereo matching as a 3D window warping problem in video volumes. This formulation incorporates the temporal variation of image pixels as a cue for better shape reconstruction, and it provides a general framework that unifies several triangulation-based reconstruction methods: passive stereo, active stereo, one-shot structured light, and multi-shot structured light. Under this framework, we develop algorithms that accurately reconstruct deforming objects at high spatial and temporal resolution. Spacetime stereo is also effective for a class of natural scenes, such as waving trees and flowing water, whose repetitive textures and chaotic behavior challenge existing stereo algorithms.

To demonstrate an application of spacetime stereo, we build a dynamic face capture system from off-the-shelf cameras, projectors, and computers. We also propose a new template fitting and tracking method that computes a sequence of 3D face meshes with vertex-to-vertex correspondence by combining shape and optical flow estimation. The resulting system continuously (at 20 Hz) captures high-resolution (about 20K points) facial motion simultaneously from multiple viewpoints, without the need to paint markers on the face. Finally, we develop data-driven facial animation tools that take advantage of the captured 3D face sequences, enabling untrained users to produce realistic facial expressions and animations.
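The core idea of extending the stereo matching window into time can be illustrated with a minimal sketch. Assuming two rectified video volumes stored as (T, H, W) NumPy arrays, the snippet below scores candidate disparities with a sum-of-squared-differences cost over a small spatiotemporal (rather than purely spatial) window and picks the winner. The function names here are hypothetical, and the sketch deliberately omits the dissertation's window *warping* (disparity varying linearly within the window), using a fronto-parallel window instead.

```python
import numpy as np

def spacetime_ssd(left, right, x, y, t, d, win=(3, 3, 3)):
    """SSD matching cost over a spatiotemporal window centered at (t, y, x),
    comparing the left video volume to the right volume shifted by disparity d.
    Volumes are (T, H, W) arrays; win = (time, height, width) window sizes.
    Hypothetical helper: a fronto-parallel simplification of spacetime stereo,
    with no window warping."""
    wt, wy, wx = (w // 2 for w in win)
    a = left[t - wt:t + wt + 1, y - wy:y + wy + 1, x - wx:x + wx + 1]
    b = right[t - wt:t + wt + 1, y - wy:y + wy + 1, x - d - wx:x - d + wx + 1]
    return float(np.sum((a - b) ** 2))

def best_disparity(left, right, x, y, t, d_max, win=(3, 3, 3)):
    """Winner-take-all disparity: the shift minimizing the spacetime cost."""
    costs = [spacetime_ssd(left, right, x, y, t, d, win) for d in range(d_max + 1)]
    return int(np.argmin(costs))
```

Because the window spans several frames, temporal appearance changes (e.g., from a moving projected pattern in active stereo) contribute to the matching cost, which is what makes the temporal dimension a disambiguating cue.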