Efficient evaluation of nearest-neighbor queries in content-addressable networks

  • Authors: Erik Buchmann; Klemens Böhm
  • Affiliations: University of Magdeburg, Germany; University of Karlsruhe, Germany
  • Venue: From Integrated Publication and Information Systems to Virtual Information and Knowledge Environments
  • Year: 2005

Abstract

Content-Addressable Networks (CAN) can manage huge sets of (key, value) pairs and cope with very high workloads. They follow the peer-to-peer (P2P) paradigm in order to build scalable, distributed data structures on top of the Internet. CAN are designed to drive Internet-scale applications such as distributed search engines and multimedia retrieval systems. In these scenarios, the nearest-neighbor (NN) query model is very natural: the user specifies a query key, and the engine responds with the set of query results closest to the key. Implementing NN queries in CAN is challenging. As with any P2P system, global knowledge about the peers responsible for parts of the query result is not available, and the communication overhead is the most critical factor. In this paper, we present our approach to realizing efficient NN queries in CAN. We evaluate our NN query processing scheme by experiments with a CAN implementation in a setting derived from web applications. The results of our experiments with 10,000 peers are positive: even large result sets with a precision of 75% can be obtained by invoking fewer than 1.6 peers on average. In addition, our NN protocol is suitable for prefetching in settings with sequences of consecutive queries for similar keys.
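
The trade-off between the number of peers invoked and the precision of the result can be illustrated with a toy model. The sketch below is not the protocol from the paper. It assumes a two-dimensional key space [0,1)² split into a regular grid of zones, one zone per peer, and answers a query using only the peer whose zone contains the query key. All names (`owner`, `build_zones`, `local_nn`, `precision`) are hypothetical and chosen for illustration only.

```python
import math

def owner(key, grid):
    """Grid cell (peer zone) that contains a 2-D key in [0,1)^2."""
    x, y = key
    return (min(int(x * grid), grid - 1), min(int(y * grid), grid - 1))

def build_zones(keys, grid):
    """Assign every key to the peer whose zone contains it."""
    zones = {}
    for k in keys:
        zones.setdefault(owner(k, grid), []).append(k)
    return zones

def knn(candidates, query, k):
    """The k candidates closest to the query key (Euclidean distance)."""
    return sorted(candidates, key=lambda p: math.dist(p, query))[:k]

def local_nn(zones, query, k, grid):
    """Approximate answer: ask only the peer that owns the query key."""
    return knn(zones.get(owner(query, grid), []), query, k)

def precision(approx, exact):
    """Fraction of the true k nearest neighbors found in the approximate answer."""
    if not exact:
        return 1.0
    return len(set(approx) & set(exact)) / len(exact)

# Toy data: five keys, a 2x2 grid of peers, and a query near a zone border.
keys = [(0.1, 0.1), (0.2, 0.2), (0.45, 0.45), (0.55, 0.55), (0.9, 0.9)]
zones = build_zones(keys, grid=2)
query = (0.4, 0.4)
exact = knn(keys, query, k=2)                 # requires global knowledge
approx = local_nn(zones, query, k=2, grid=2)  # invokes a single peer
print(precision(approx, exact))
```

In this toy setting the single-peer answer misses a neighbor that lies just across the zone border, which is why a practical scheme has to decide when contacting additional peers is worth the communication cost.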