Hyperspaces for object clustering and approximate matching in peer-to-peer overlays

  • Authors:
  • Bernard Wong; Ymir Vigfússon; Emin Gün Sirer

  • Affiliations:
  • Dept. of Computer Science, Cornell University, Ithaca, NY; Dept. of Computer Science, Cornell University, Ithaca, NY; Dept. of Computer Science, Cornell University, Ithaca, NY

  • Venue:
  • HotOS '07: Proceedings of the 11th USENIX Workshop on Hot Topics in Operating Systems
  • Year:
  • 2007

Abstract

Existing distributed hash tables provide efficient mechanisms for storing and retrieving a data item by an exact key, but are unsuitable when the search key is similar, but not identical, to the key under which the data item was stored. In this paper, we present a scalable and efficient peer-to-peer system with a new search primitive that finds the k data items whose keys are closest to the search key. The system works via a novel assignment of virtual coordinates to each object in a high-dimensional, synthetic space such that the proximity between two points in the coordinate space is correlated with the similarity between the strings that the points represent. We examine the feasibility of this approach for efficient peer-to-peer search on inexact string keys, and show that the system provides a robust way to handle the key perturbations that naturally occur in applications, such as file-sharing networks, where users supply the query strings.
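
The abstract does not say how the coordinates are assigned, so the following is only a minimal illustrative sketch of the general idea, not the authors' construction. It assumes a landmark-based embedding: each key's coordinates are its edit distances to a fixed set of landmark strings. The landmark list, the function names, and the use of Euclidean distance are all assumptions made for illustration. Because edit distance obeys the triangle inequality, two keys at edit distance d differ by at most d along every axis, so similar strings land near each other.

```python
import math

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]

# Hypothetical landmark strings (an assumption); one coordinate axis per landmark.
LANDMARKS = ["alpha", "session", "music", "report"]

def embed(key: str) -> tuple:
    """Map a string key to a point: its edit distance to each landmark."""
    return tuple(edit_distance(key, lm) for lm in LANDMARKS)

def k_closest(query: str, stored_keys: list, k: int) -> list:
    """Return the k stored keys whose points lie nearest the query's point.

    This is a local linear scan for illustration only; it does not model
    how an overlay would partition or route through the coordinate space.
    """
    q = embed(query)
    return sorted(stored_keys, key=lambda s: math.dist(q, embed(s)))[:k]
```

A query such as `k_closest("yesterdya", keys, 3)` would then rank stored keys by coordinate distance rather than by exact match. Distinct strings can collide on the same point, so a practical system would presumably re-rank the candidates by true string similarity.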