The one-sided communication model supported by MPI-2 can be more convenient to use than the regular two-sided model and has the potential to provide better performance. The MPI-2 standard leaves implementations flexibility in when RMA operations are issued and completed. The current MPICH2 implementation takes a lazy approach, in which operations are queued and issued during the subsequent synchronization phase. This benefits small data transfers because it reduces the number of network operations, but for large data transfers, issuing operations eagerly could achieve better performance. In this paper we describe the design and implementation of an adaptive strategy for one-sided operations and the synchronization mechanisms supported by MPI-2 (fence, post-start-complete-wait, lock-unlock), which combines the benefits of the lazy and eager approaches. Our performance results demonstrate that the adaptive approach performs as well as the lazy approach for small data transfers and achieves performance similar to that of the eager approach for large data transfers. In addition, it achieves good overlap of communication with computation.
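The lazy/eager trade-off described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the MPICH2 design: the abstract does not say how the adaptive strategy chooses between the two modes, so the size cutoff `EAGER_THRESHOLD_BYTES`, the class `AdaptiveEpoch`, and its `put` and `fence` methods are all hypothetical names introduced here, and no real MPI calls are made.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff; the abstract does not give a value or a switching rule.
EAGER_THRESHOLD_BYTES = 16 * 1024


@dataclass
class AdaptiveEpoch:
    """Toy model of one RMA access epoch (illustrative only, no MPI)."""
    threshold: int = EAGER_THRESHOLD_BYTES
    queued: list = field(default_factory=list)  # small ops deferred to sync
    log: list = field(default_factory=list)     # (when_issued, nbytes) pairs

    def put(self, nbytes: int) -> None:
        if nbytes >= self.threshold:
            # Eager path: issue immediately so the transfer can overlap
            # with computation done before the synchronization call.
            self.log.append(("eager", nbytes))
        else:
            # Lazy path: defer small operations until synchronization.
            self.queued.append(nbytes)

    def fence(self) -> None:
        # Synchronization: issue everything still queued, in order.
        for nbytes in self.queued:
            self.log.append(("at_sync", nbytes))
        self.queued.clear()


epoch = AdaptiveEpoch()
epoch.put(64)        # small: queued
epoch.put(1 << 20)   # large: issued eagerly
epoch.put(128)       # small: queued
epoch.fence()
print(epoch.log)
```

In this model the 1 MiB transfer is issued as soon as it is requested, while the two small transfers are held back and issued together at `fence()`, which is the behavior the abstract attributes to the eager and lazy approaches respectively.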