Scaling All-to-All Multicast on Fat-tree Networks

  • Authors:
  • Sameer Kumar;Laxmikant V. Kale

  • Affiliations:
  • University of Illinois at Urbana-Champaign;University of Illinois at Urbana-Champaign

  • Venue:
  • ICPADS '04 Proceedings of the Parallel and Distributed Systems, Tenth International Conference
  • Year:
  • 2004

Abstract

In this paper, we study the all-to-all multicast operation. Strategies for all-to-all multicast need to be different for small and large messages. For small messages, the major issue is the minimization of software overhead, whereas for large messages, the issue is network contention. Many modern large parallel computers use the fat-tree interconnection topology. We therefore analyze network contention on fat-tree networks and use known contention-free communication schedules for these networks in the design of two novel strategies that optimize collective multicast. We evaluate the performance of these strategies with up to 256 nodes (1024 processors) on an Alpha cluster. We present schemes that perform well when a contiguous chunk of nodes is not available. For large messages, many of our strategies have two times better throughput than native MPI. We also demonstrate that the software overhead of a collective operation is a small fraction of the total completion time in the presence of the communication co-processor. We therefore compare the performance of the studied strategies using two metrics: (i) completion time and (ii) computation overhead.
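
To make the operation concrete, the following is a minimal sketch of an all-to-all multicast in which every processor delivers its own message to every other processor. It simulates an XOR pairwise schedule, where in step i processor p sends to p XOR i, so each step is a permutation with exactly one send and one receive per processor. This schedule is chosen for illustration only; the abstract does not say which contention-free schedules the paper uses, and the function name `xor_allgather` and the power-of-two restriction are assumptions of this sketch.

```python
def xor_allgather(num_procs):
    """Simulate an all-to-all multicast with an XOR pairwise schedule.

    Illustrative assumption: the schedule is not taken from the paper.
    Returns, for each rank, the set of source ranks whose message it holds.
    """
    if num_procs < 1 or num_procs & (num_procs - 1):
        raise ValueError("num_procs must be a power of two")

    # Every rank starts with only its own message.
    received = [{p} for p in range(num_procs)]

    for step in range(1, num_procs):
        # In this step, rank p sends its own message to rank p XOR step.
        destinations = [p ^ step for p in range(num_procs)]

        # The destinations form a permutation of the ranks, so each rank
        # sends exactly one message and receives exactly one message.
        assert sorted(destinations) == list(range(num_procs))

        for src, dst in enumerate(destinations):
            received[dst].add(src)

    return received
```

After `num_procs - 1` steps every rank holds every message. The sketch only checks the one-send, one-receive property of each step; it does not model fat-tree links, so it says nothing about contention inside the network.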