Using SCTP to hide latency in MPI programs

  • Authors:
  • H. Kamal, B. Penoff, M. Tsai, E. Vong, A. Wagner

  • Affiliations:
  • University of British Columbia, Dept. of Computer Science, Vancouver, BC, Canada (all authors)

  • Venue:
  • IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
  • Year:
  • 2006

Abstract

A difficulty in using heterogeneous collections of geographically distributed machines across wide area networks for parallel computing is the huge variability in message latency, which is orders of magnitude larger than that experienced by parallel programs executing on dedicated systems. This variability is due in part to the underlying network bandwidth and latency, which can vary dramatically with network conditions. Although such an environment is unsuitable for many message-passing programs, there are programs that can take advantage of it. Using SCTP (Stream Control Transmission Protocol) for MPI, we show how to reduce the effect of latency on task farm programs so that they can execute effectively in high-latency environments. SCTP is a recently standardized transport-level protocol with a number of features that make it well-suited to MPI, and our goal is to reduce the effect of latency on MPI programs in wide area networks. We take advantage of SCTP's improved congestion control, as well as its ability to carry multiple independent message streams over a single connection, to eliminate the head-of-line blocking that can occur in TCP-based middleware. The use of streams required a novel use of MPI tags to identify independent streams rather than different types of messages. We describe the design of a task farm template that exploits streams and uses buffering and pipelining of task requests to improve performance under network loss and variable latency. We use these techniques to improve the performance of two real-world MPI programs: a robust correlation matrix computation and mpiBLAST.
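The head-of-line blocking argument in the abstract can be illustrated with a small simulation. The sketch below is not the paper's code: it is a hypothetical model in which each message carries a global sequence number, a tag (standing in for an SCTP stream identifier, as in the paper's tag-to-stream mapping), and a per-stream sequence number. A TCP-like receiver enforces one total order, so a single delayed message stalls everything behind it; an SCTP-like receiver enforces order only within each stream, so messages on other streams are delivered immediately.

```python
# Hypothetical illustration of head-of-line blocking (not the paper's
# implementation). Messages are (global_seq, tag, stream_seq, payload).
from collections import defaultdict


def deliver_single_channel(arrivals):
    """TCP-like delivery: one global order; a gap in the global sequence
    stalls every message that arrives behind it."""
    delivered, buf, next_seq = [], {}, 0
    for gseq, tag, sseq, payload in arrivals:
        buf[gseq] = (tag, payload)
        while next_seq in buf:
            delivered.append(buf.pop(next_seq))
            next_seq += 1
    return delivered


def deliver_per_stream(arrivals):
    """SCTP-like delivery: order is enforced per tag (stream), so a gap on
    one stream never blocks delivery on another stream."""
    delivered = []
    bufs = defaultdict(dict)   # tag -> {stream_seq: payload}
    nxt = defaultdict(int)     # tag -> next expected stream_seq
    for gseq, tag, sseq, payload in arrivals:
        bufs[tag][sseq] = payload
        while nxt[tag] in bufs[tag]:
            delivered.append((tag, bufs[tag].pop(nxt[tag])))
            nxt[tag] += 1
    return delivered


# Four messages on two streams; the first message ('A', seq 0) is lost and
# retransmitted, so it arrives last.
arrivals = [
    (1, "B", 0, "b0"),
    (2, "A", 1, "a1"),
    (3, "B", 1, "b1"),
    (0, "A", 0, "a0"),  # delayed retransmission
]

# Single channel: nothing is delivered until the retransmission arrives.
print(deliver_single_channel(arrivals))
# Per-stream: both "B" messages are delivered without waiting for "A".
print(deliver_per_stream(arrivals))
```

Under the single channel, the first three arrivals deliver nothing; per-stream delivery hands both `B` messages to the application immediately, which is the behavior the paper's task farm template exploits when independent tasks travel on independent streams.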