Reducing the bandwidth requirements of p2p keyword indexing

  • Authors:
  • John Casey;Wanlei Zhou

  • Affiliations:
  • School of Information Technology, Deakin University, Burwood, VIC, Australia;School of Information Technology, Deakin University, Burwood, VIC, Australia

  • Venue:
  • ICA3PP'05 Proceedings of the 6th international conference on Algorithms and Architectures for Parallel Processing
  • Year:
  • 2005

Abstract

This paper describes the design and evaluation of a federated, peer-to-peer indexing system that integrates the resources of local systems into a globally addressable index using a distributed hash table. The salient feature of the indexing system's design is the efficient dissemination of term-document indices through a combination of duplicate elimination, leaf set forwarding, and conventional techniques such as aggressive index pruning, index compression, and batching. Together, these indexing strategies reduce the number of RPC operations required to locate the nodes responsible for a section of the index, as well as the bandwidth utilization and latency of the indexing service. Using empirical observation, we evaluate the performance benefits of these cumulative optimizations and show that these design trade-offs can significantly improve indexing performance when using a distributed hash table.
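To make two of the techniques named in the abstract concrete, the following is a minimal sketch of duplicate elimination combined with batching. It is not the authors' implementation: the `Ring` class, the successor-style placement rule, the SHA-1 key function, and all node and document names are assumptions chosen for illustration, and leaf set forwarding, pruning, and compression are not shown. The idea is that postings are grouped by the node responsible for each term, so that one message per node can replace one message per posting.

```python
import hashlib
from bisect import bisect_left
from collections import defaultdict


def key_of(name: str) -> int:
    """Map a string to a 160-bit identifier (SHA-1 is an assumed choice)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class Ring:
    """Toy hash ring standing in for a real DHT lookup (hypothetical)."""

    def __init__(self, node_names):
        self.nodes = sorted((key_of(n), n) for n in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]

    def responsible_node(self, term: str) -> str:
        # Assumed rule: the first node at or after the term's key, wrapping.
        idx = bisect_left(self.ids, key_of(term)) % len(self.ids)
        return self.nodes[idx][1]


def build_batches(ring, postings):
    """Group (term, doc_id) postings by responsible node.

    Using a set per term removes duplicate postings before anything is sent.
    Returns {node_name: {term: {doc_id, ...}}}.
    """
    batches = defaultdict(lambda: defaultdict(set))
    for term, doc_id in postings:
        batches[ring.responsible_node(term)][term].add(doc_id)
    return batches


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c"])
    postings = [("p2p", "d1"), ("p2p", "d1"), ("index", "d1"),
                ("index", "d2"), ("dht", "d3")]
    batches = build_batches(ring, postings)
    # One message per destination node instead of one per posting.
    print(f"{len(postings)} postings -> {len(batches)} batched messages")
```

In this sketch the number of outgoing messages is bounded by the number of distinct responsible nodes rather than by the number of postings, which is the effect batching is meant to achieve.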