Clusters in the area of high-performance computing have been growing in size at a considerable rate. In these clusters, the dominant programming model is the Message Passing Interface (MPI), so the MPI library plays a key role in both resource usage and performance. To obtain maximal performance, many clusters deploy a high-speed interconnect between compute nodes. One such interconnect, InfiniBand, has been gaining popularity due to its features, including Remote Direct Memory Access (RDMA) and high performance. As a result, it is being deployed in a significant number of clusters and has been chosen as the standard interconnect for capacity clusters within the DOE Tri-Labs. As these clusters grow in size, care must be taken to ensure that resource usage does not grow excessively with scale. In particular, MPI library resource usage should not grow at a rate that will exhaust node memory or starve user applications.
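To make the scaling concern concrete, the sketch below gives a back-of-the-envelope estimate of per-rank and per-node communication buffer memory under a fully connected reliable-connection model, in which each process keeps dedicated buffers for every peer. The constants (buffers per connection, buffer size, ranks per node) are illustrative assumptions, not values taken from any particular MPI implementation.

```c
#include <stdio.h>

/* Hypothetical estimate of MPI communication buffer memory when every
 * process holds a dedicated connection, with per-connection buffers,
 * to every other process. All constants are illustrative assumptions. */
int main(void)
{
    const long buffers_per_conn = 16;        /* assumed buffers per connection      */
    const long buffer_size      = 8 * 1024;  /* assumed 8 KiB per buffer            */
    const long ranks_per_node   = 16;        /* assumed MPI ranks per compute node  */

    long cluster_sizes[] = {1024, 4096, 16384, 65536};
    for (int i = 0; i < 4; i++) {
        long nprocs = cluster_sizes[i];
        long peers  = nprocs - 1;            /* fully connected: one connection per peer */
        long per_rank_bytes = peers * buffers_per_conn * buffer_size;
        long per_node_bytes = per_rank_bytes * ranks_per_node;
        printf("%6ld processes: %8.1f MiB per rank, %10.1f MiB per node\n",
               nprocs,
               per_rank_bytes / (1024.0 * 1024.0),
               per_node_bytes / (1024.0 * 1024.0));
    }
    return 0;
}
```

Under these assumed constants, per-node buffer memory grows linearly with the total number of ranks, so at tens of thousands of processes the library alone could claim many gigabytes on each node, which is precisely the trend the abstract warns against.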