Clustering Billions of Images with Large Scale Nearest Neighbor Search

  • Authors:
  • Ting Liu;Charles Rosenberg;Henry A. Rowley

  • Affiliations:
  • Google Inc., Mountain View, CA, USA;Google Inc., Mountain View, CA, USA;Google Inc., Mountain View, CA, USA

  • Venue:
  • WACV '07 Proceedings of the Eighth IEEE Workshop on Applications of Computer Vision
  • Year:
  • 2007

Abstract

The proliferation of the web and digital photography has made large-scale image collections containing billions of images a reality. Image collections on this scale make even the most common and simple computer vision, image processing, and machine learning tasks non-trivial. An example is nearest neighbor search, which not only serves as a fundamental subproblem in many more sophisticated algorithms, but also has direct applications, such as image retrieval and image clustering. In this paper, we address the nearest neighbor problem as the first step towards scalable image processing. We describe a scalable version of an approximate nearest neighbor search algorithm and discuss how it can be used to find near duplicates among over a billion images.
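
To make the link between nearest neighbor search and near-duplicate clustering concrete, here is a minimal sketch. It is not the algorithm described in the paper: it uses exact, brute-force pairwise distances (quadratic in the number of images), which is exactly what stops being feasible at the scale the abstract describes. The function names, the Euclidean distance, the distance threshold, and the toy two-dimensional feature vectors are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def near_duplicate_clusters(features, threshold):
    """Group items whose feature vectors are linked by distances <= threshold.

    Hypothetical helper: compares every pair (O(n^2)) and merges close
    pairs with a union-find structure. A scalable system would replace
    the pairwise loop with an approximate nearest neighbor index.
    """
    parent = list(range(len(features)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(features)), 2):
        if euclidean(features[i], features[j]) <= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                # Keep the smaller index as the root of the merged set.
                parent[max(ri, rj)] = min(ri, rj)

    clusters = {}
    for i in range(len(features)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy feature vectors standing in for image descriptors (assumed data).
toy_features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.05), (10.0, 10.0)]
print(near_duplicate_clusters(toy_features, threshold=0.5))
```

In this sketch, clusters are the connected components of the "within threshold" graph, so two images can end up in the same cluster through a chain of close neighbors even if they are not directly within the threshold of each other.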