Fast MPI Broadcasts through Reliable Multicasting

  • Authors:
  • Paul Sack;Anne C. Elster

  • Venue:
  • PARA '02 Proceedings of the 6th International Conference on Applied Parallel Computing: Advanced Scientific Computing
  • Year:
  • 2002

Abstract

When parallel programs run on clusters of individual computers or workstations, network communication is often the performance bottleneck. Because the round-trip time for a network packet is orders of magnitude larger than the time needed to transfer an equivalent amount of data from memory, methods that reduce network usage can significantly improve the performance of parallel programs. This work demonstrates that broadcast performance can be improved by a significant factor by using a portable reliable multicasting protocol instead of the unicasting that is typically used. Our end product is a patch for MPICH, a popular, portable MPI implementation provided by Argonne National Laboratory (ANL). The patch requires no kernel modification and is therefore portable to any UNIX-based system. Since absolute reliability is critical for data integrity when broadcasting messages on clusters, our multicasting protocol also addresses reliability issues.
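
To make the unicast-versus-multicast comparison concrete, the following minimal sketch counts data transmissions for a broadcast to a given number of processes. It is an illustrative model only, not the protocol from the paper: it assumes the unicast broadcast uses a binomial tree (a common choice in MPI implementations), that a multicast delivers the message to all receivers in a single transmission, and that each receiver returns exactly one acknowledgement. The function names are hypothetical, and packet loss and retransmission are ignored.

```python
def unicast_tree_broadcast(num_procs):
    """Binomial-tree broadcast over unicast (assumed model).

    Returns (data_sends, steps): total point-to-point data messages
    and the number of sequential communication steps.
    """
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    data_sends = num_procs - 1               # every non-root receives one copy
    steps = (num_procs - 1).bit_length()     # ceil(log2(num_procs)) rounds
    return data_sends, steps


def multicast_broadcast(num_procs):
    """Single multicast with one acknowledgement per receiver (assumed model).

    Returns (data_sends, steps, acks).
    """
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    receivers = num_procs - 1
    data_sends = 1 if receivers else 0       # one packet reaches all receivers
    steps = 1 if receivers else 0
    acks = receivers                         # assumption: one ACK per receiver
    return data_sends, steps, acks


if __name__ == "__main__":
    for p in (2, 8, 16, 64):
        print(p, unicast_tree_broadcast(p), multicast_broadcast(p))
```

Under these assumptions the payload crosses the network once rather than once per receiver, which is the source of the potential saving. The acknowledgements are small but still grow with the number of receivers, so a real protocol has to handle them carefully.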