Peer-to-peer similarity search in metric spaces

  • Authors:
  • Christos Doulkeridis;Akrivi Vlachou;Yannis Kotidis;Michalis Vazirgiannis

  • Affiliations:
  • Athens University of Economics and Business, Greece;Athens University of Economics and Business, Greece;Athens University of Economics and Business, Greece;Athens University of Economics and Business, Greece and INRIA/FUTURS, France

  • Venue:
  • VLDB '07 Proceedings of the 33rd international conference on Very large data bases
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper addresses the efficient processing of similarity queries in metric spaces, where data is horizontally distributed across a P2P network. The proposed approach does not rely on arbitrary data movement, hence each peer joining the network autonomously stores its own data. We present SIMPEER, a novel framework that dynamically clusters peer data, in order to build distributed routing information at super-peer level. SIMPEER allows the evaluation of range and nearest neighbor queries in a distributed manner that reduces communication cost, network latency, bandwidth consumption and computational overhead at each individual peer. SIMPEER utilizes a set of distributed statistics and guarantees that all similar objects to the query are retrieved, without necessarily flooding the network during query processing. The statistics are employed for estimating an adequate query radius for k-nearest neighbor queries, and transform the query to a range query. Our experimental evaluation employs both real-world and synthetic data collections, and our results show that SIMPEER performs efficiently, even in the case of high degree of distribution.