Designing Efficient Distributed Algorithms Using Sampling Techniques

  • Authors:
  • Sanguthevar Rajasekaran; David S. L. Wei

  • Venue:
  • IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
  • Year:
  • 1997

Abstract

In this paper we show the power of sampling techniques in designing efficient distributed algorithms. In particular, we show that, on some networks, sampling allows selection to be performed with a message complexity that is independent of the cardinality of the set (file), provided the file size is polynomial in the network size. For example, given a file $F$ of size $n$ and an integer $k$ ($1 \leq k \leq n$), on a $p$-processor de Bruijn network our deterministic selection algorithm finds the $k$th smallest key of $F$ using $O(p \log^{3} p)$ messages with a communication delay of $O(\log^{3} p)$, and our randomized selection algorithm completes the same task using only $O(p)$ messages and a communication delay of $O(\log p)$ with high probability, provided the file size is polynomial in the network size. Our randomized selection algorithm outperforms the existing approaches in both message complexity and communication delay. Because the number of messages and the communication delay are independent of the file size, our distributed selection schemes are extremely attractive in domains such as very large database systems. Using our selection algorithms to select pivot element(s), we also develop a near-optimal quicksort-based sorting scheme and a near-optimal enumeration sorting scheme for sorting large distributed files on the hypercube and de Bruijn networks. Our algorithms are fully distributed, without any a priori central control.
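The abstract does not spell out the selection algorithms, so the following is only a minimal sketch of the general idea of sampling-based selection, not the authors' method. It simulates the processors on a single machine as a list of local key lists and ignores the network topology entirely. The function name `sampled_select`, the parameter `sample_size`, and the pivot slack of roughly the square root of the sample size are all assumptions made for illustration. In each round a small random sample is gathered, two pivots that are likely to bracket the target rank are taken from it, every processor reports a few local counts, and all keys outside the bracket are discarded.

```python
import math
import random


def sampled_select(parts, k, sample_size=64, seed=0):
    """Return the k-th smallest key (1-indexed) among all keys in `parts`.

    `parts[i]` stands in for the keys held locally by processor i.
    This is an illustrative single-machine simulation, not the paper's algorithm.
    """
    rng = random.Random(seed)
    parts = [list(p) for p in parts]
    n = sum(len(p) for p in parts)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")

    while n > sample_size:
        # Each processor forwards each surviving key with probability
        # sample_size / n, so about sample_size keys travel in total.
        prob = sample_size / n
        sample = sorted(x for p in parts for x in p if rng.random() < prob)
        if not sample:
            continue  # unlucky empty draw: resample

        # Pick two pivots around the sample position matching rank k.
        m = len(sample)
        t = min(m - 1, (k * m) // n)
        slack = math.isqrt(m) + 1  # assumed slack, chosen for illustration
        lo = sample[max(0, t - slack)]
        hi = sample[min(m - 1, t + slack)]

        # Each processor would report these four local counts;
        # here they are summed directly.
        lt_lo = sum(x < lo for p in parts for x in p)
        le_lo = sum(x <= lo for p in parts for x in p)
        lt_hi = sum(x < hi for p in parts for x in p)
        le_hi = sum(x <= hi for p in parts for x in p)

        if k <= lt_lo:
            # Target lies below the lower pivot.
            parts = [[x for x in p if x < lo] for p in parts]
        elif k <= le_lo:
            return lo
        elif k <= lt_hi:
            # Target lies strictly between the pivots (the expected case).
            parts = [[x for x in p if lo < x < hi] for p in parts]
            k -= le_lo
        elif k <= le_hi:
            return hi
        else:
            # Target lies above the upper pivot.
            parts = [[x for x in p if x > hi] for p in parts]
            k -= le_hi
        n = sum(len(p) for p in parts)

    # Few keys remain: gather them and finish with a direct sort.
    return sorted(x for p in parts for x in p)[k - 1]


if __name__ == "__main__":
    keys = list(range(10_000))
    random.Random(1).shuffle(keys)
    processors = [keys[i::8] for i in range(8)]  # 8 simulated processors
    print(sampled_select(processors, 5_000))
```

In this sketch the data exchanged per round is the sample plus four counts per processor, which depends on the sample size and the number of processors rather than on the file size. That mirrors the flavor of the claim in the abstract, but the sketch makes no attempt to reproduce the stated $O(p)$ message bound or the $O(\log p)$ delay.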