Scene reconstruction and visualization from internet photo collections

  • Authors:
  • Steve Seitz; Rick Szeliski; Keith N. Snavely

  • Affiliations:
  • University of Washington; University of Washington; University of Washington

  • Venue:
  • Scene reconstruction and visualization from internet photo collections
  • Year:
  • 2009

Abstract

The Internet is becoming an unprecedented source of visual information, with billions of images instantly accessible through image search engines such as Google Images and Flickr. These include thousands of photographs of virtually every famous place, taken from a multitude of viewpoints, at many different times of day, and under a variety of weather conditions. This thesis addresses the problem of leveraging such photos to create new 3D interfaces for virtually exploring our world. One key challenge is that recreating 3D scenes from photo collections requires knowing where each photo was taken. This thesis introduces new computer vision techniques that robustly recover such information from photo collections without requiring GPS or other instrumentation. These methods are the first to be demonstrated on Internet imagery, and show that 3D reconstruction techniques can be successfully applied to this rich, largely untapped resource. For this problem, scale is a particular concern, as Internet collections can be extremely large. I introduce an efficient reconstruction algorithm that selects a small skeletal set of images in a preprocessing step. This approach can reduce reconstruction time by an order of magnitude with little or no loss in completeness or accuracy. A second challenge is to build interfaces that take these reconstructions and provide effective scene visualizations. Towards this end, I describe two new 3D user interfaces. The first, Photo Tourism, is a 3D photo browser with new geometric controls for moving between photos. These include zooming in to find details, zooming out for more context, and selecting an image region to find photos of an object. The second interface, Pathfinder, takes advantage of the fact that people tend to take photos of interesting views and along interesting paths. Pathfinder creates navigation controls tailored to each location by analyzing the distribution of photos to discover such characteristic views and paths. These controls make it easy to find and explore the important parts of each scene. Together, these techniques enable the automatic creation of 3D experiences for famous sites. A user simply enters relevant keywords, and the system automatically downloads images, reconstructs the site, derives navigation controls, and provides an immersive interface.