Architectural requirements and scalability of the NAS parallel benchmarks
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Design and Evaluation of Nemesis, a Scalable, Low-Latency, Message-Passing Communication Subsystem
CCGRID '06 Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid
Data Transfers between Processes in an SMP System: Performance Study and Application to MPI
ICPP '06 Proceedings of the 2006 International Conference on Parallel Processing
Feasibility study of MPI implementation on the heterogeneous multi-core cell BE™ architecture
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
Implementing MPI-IO Atomic Mode and Shared File Pointers Using MPI One-Sided Communication
International Journal of High Performance Computing Applications
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
A Buffered-Mode MPI Implementation for the Cell BETM Processor
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
Non-data-communication Overheads in MPI: Analysis on Blue Gene/P
Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
A Prototype Implementation of MPI for SMARTMAP
Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Efficient Shared Memory Message Passing for Inter-VM Communications
Euro-Par 2008 Workshops - Parallel Processing
The Importance of Non-Data-Communication Overheads in MPI
International Journal of High Performance Computing Applications
Exploiting Direct Access Shared Memory for MPI On Multi-Core Processors
International Journal of High Performance Computing Applications
Open issues in MPI implementation
ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture
A synchronous mode MPI implementation on the cell BETM architecture
ISPA'07 Proceedings of the 5th international conference on Parallel and Distributed Processing and Applications
Redesigning MPI shared memory communication for large multi-core architecture
Computer Science - Research and Development
Globalizing selectively: shared-memory efficiency with address-space separation
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Hi-index | 0.00 |
This paper presents the implementation of MPICH2 over the Nemesis communication subsystem and the evaluation of its shared-memory performance. We describe design issues as well as some of the optimization techniques we employed. We conducted a performance evaluation over shared memory using microbenchmarks as well as application benchmarks. The evaluation shows that MPICH2 Nemesis has very low communication overhead, making it suitable for smaller-grained applications.