MPI-2's one-sided communication interface is being explored in scientific applications. One of the important operations in a one-sided model is read-modify-write. MPI-2 provides the MPI_Put, MPI_Get, and MPI_Accumulate operations, which can be used to implement read-modify-write functionality. Different strategies for doing so yield varying performance benefits depending on the underlying one-sided implementation. In this paper we use the HPCC Random Access benchmark, which consists primarily of read-modify-write operations, as a case study for evaluating the different implementation strategies. Currently this benchmark is implemented using MPI two-sided semantics. We design and evaluate MPI-2 versions of the HPCC Random Access benchmark that use one-sided operations. To improve performance, we explore two optimizations: (i) software-based aggregation and (ii) hardware-based atomic operations. We evaluate these approaches on an InfiniBand cluster. Software-based aggregation outperforms the basic one-sided scheme without aggregation by a factor of 4.38, and the hardware-based scheme improves on the basic one-sided scheme by a factor of 2.62.
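The idea behind software-based aggregation can be illustrated with a small simulation. The sketch below is not the paper's implementation and does not use MPI: it models each rank's window as a plain Python list, uses an XOR update in the style of Random Access as the read-modify-write operation, and counts one "message" per simulated remote operation. The function names, the bucket-per-owner scheme, the batch size, and the pseudo-random generator are all assumptions made for illustration.

```python
from collections import defaultdict


def make_updates(count, seed=1):
    """Deterministic pseudo-random 64-bit values (a plain LCG, NOT the HPCC generator)."""
    x = seed
    out = []
    for _ in range(count):
        x = (x * 6364136223846793005 + 1442695040888963407) % (1 << 64)
        out.append(x >> 16)
    return out


def apply_direct(num_ranks, local_size, updates):
    """Basic scheme: one simulated remote read-modify-write per update."""
    tables = [[0] * local_size for _ in range(num_ranks)]
    messages = 0
    for value in updates:
        index = value % (num_ranks * local_size)
        owner, offset = divmod(index, local_size)
        tables[owner][offset] ^= value  # read-modify-write at the owning rank
        messages += 1
    return tables, messages


def apply_aggregated(num_ranks, local_size, updates, batch_size):
    """Aggregated scheme: buffer updates per owner and send them in batches."""
    tables = [[0] * local_size for _ in range(num_ranks)]
    buckets = defaultdict(list)  # owner rank -> pending (offset, value) pairs
    messages = 0

    def flush(owner):
        nonlocal messages
        for offset, value in buckets[owner]:
            tables[owner][offset] ^= value
        buckets[owner].clear()
        messages += 1  # the whole batch travels as one simulated message

    for value in updates:
        index = value % (num_ranks * local_size)
        owner, offset = divmod(index, local_size)
        buckets[owner].append((offset, value))
        if len(buckets[owner]) >= batch_size:
            flush(owner)

    # Send whatever is still buffered at the end.
    for owner in list(buckets):
        if buckets[owner]:
            flush(owner)
    return tables, messages


if __name__ == "__main__":
    updates = make_updates(1000)
    direct_tables, direct_msgs = apply_direct(4, 8, updates)
    agg_tables, agg_msgs = apply_aggregated(4, 8, updates, batch_size=16)
    print("same result:", direct_tables == agg_tables)
    print("messages:", direct_msgs, "vs", agg_msgs)
```

Because XOR is commutative and associative, reordering updates into per-owner batches leaves the final table unchanged while reducing the number of simulated messages. In a real MPI-2 program the batch would be delivered with a one-sided call such as MPI_Accumulate inside a synchronization epoch, and the actual benefit depends on the underlying one-sided implementation, as the abstract notes.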