Software Techniques for Improving MPP Bulk-Transfer Performance

  • Authors: Eric A. Brewer; Paul Gauthier; Armando Fox; Angela Schuett

  • Venue: IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
  • Year: 1996

Abstract

Brewer & Kuszmaul (1994) demonstrated how barriers and traffic interleaving can alleviate the problem of bulk-transfer performance degradation on the Thinking Machines CM-5 massively parallel processor (MPP) by exploiting the observation that one-on-one communication avoids network congestion. We apply and extend these techniques on the Intel Paragon and MIT Alewife machines. Because these machines lack the CM-5's fast hardware support for barriers, we introduce a token-passing scheme that avoids barriers while maintaining one-on-one communication. We also introduce a new algorithm, distributed dynamic scheduling, that brings Brewer & Kuszmaul's observations to bear on irregular traffic patterns by massaging traffic into a sequence of near-permutations at runtime, without requiring any preprocessing or global state. The measured performance of our algorithm exceeds that of traffic interleaving (the most effective technique proposed by Brewer & Kuszmaul) on all three platforms, and is comparable to the performance of static scheduling, which requires preprocessing and global state.
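
The idea of one-on-one communication can be illustrated with a small sketch. It is not the paper's distributed dynamic scheduling algorithm or its token-passing scheme, whose details the abstract does not give. It shows only the simplest precomputed case, in which an all-to-all exchange is split into rounds of cyclic shifts so that each round is a permutation: every node sends to exactly one node and receives from exactly one node. The function names `cyclic_shift_rounds` and `is_permutation` are hypothetical and chosen for illustration.

```python
def cyclic_shift_rounds(num_nodes):
    """Yield one list of (sender, receiver) pairs per communication round.

    Illustrative assumption: in the round with offset `shift`, node `src`
    sends to node (src + shift) % num_nodes, so no receiver ever has two
    senders in the same round.
    """
    for shift in range(1, num_nodes):
        yield [(src, (src + shift) % num_nodes) for src in range(num_nodes)]


def is_permutation(pairs, num_nodes):
    """Return True if every node appears exactly once as sender and once as receiver."""
    everyone = list(range(num_nodes))
    senders = sorted(src for src, _ in pairs)
    receivers = sorted(dst for _, dst in pairs)
    return senders == everyone and receivers == everyone


if __name__ == "__main__":
    for round_number, pairs in enumerate(cyclic_shift_rounds(4), start=1):
        print(round_number, pairs, is_permutation(pairs, 4))
```

With `num_nodes` nodes, the `num_nodes - 1` rounds together cover every ordered pair of distinct nodes exactly once. Irregular traffic does not decompose this neatly, which is the case the abstract's runtime near-permutations are meant to address.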