Stream control transmission protocol (SCTP): a reference guide
Stream control transmission protocol (SCTP): a reference guide
Network performance-aware collective communication for clustered wide-area systems
Parallel Computing - Clusters and computational grids for scientific computing
Parallel computation of high dimensional robust correlation and covariance matrices
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
HPDC '04 Proceedings of the 13th IEEE International Symposium on High Performance Distributed Computing
Building and Using a Fault-Tolerant MPI Implementation
International Journal of High Performance Computing Applications
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Towards MPI progression layer elimination with TCP and SCTP
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
PVM/MPI'05 Proceedings of the 12th European PVM/MPI users' group conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Hi-index | 0.00 |
A difficulty in using heterogeneous collections of geographically distributed machines across wide area networks for parallel computing is the huge variability in message latency that is orders of magnitude larger than parallel programs executing on dedicated systems. This variability is in part due to the underlying network bandwidth and latency which can vary dramatically according to network conditions. Although such an environment is not suitable for many message passing programs there are those programs that can take advantage of it. Using SCTP (Stream Control Transmission Protocol) for MPI, we show how to reduce the effect of latency on task farm programs to allow them to effectively execute in high latency environments. SCTP is a recently standardized transport level protocol that has a number of features that make it well-suited to MPI and our goal is to reduce the effect of latency on MPI programs in wide area networks. We take advantage of SCTP's improved congestion control as well as its ability to have multiple independent message streams over a single connection to eliminate the head of line blocking that can occur in TCP-based middleware. The use of streams required a novel use of MPI tags to identify independent streams rather than different types of messages. We describe the design of a task farm template that exploits streams, uses buffering and pipelining of task requests to improve its performance under network loss and variable latency. We use these techniques to improve the performance of two real-world MPI programs: a robust correlation matrix computation and mpiBLAST.