A portable runtime interface for multi-level memory hierarchies
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Adapting a message-driven parallel application to GPU-accelerated clusters
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Architecture of the Component Collective Messaging Interface
Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Impact of Node Level Caching in MPI Job Launch Mechanisms
Proceedings of the 16th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Architecture of the Component Collective Messaging Interface
International Journal of High Performance Computing Applications
ScELA: scalable and extensible launching architecture for clusters
HiPC'08 Proceedings of the 15th international conference on High performance computing
PMI: a scalable parallel process-management interface for extreme-scale systems
EuroMPI'10 Proceedings of the 17th European MPI users' group meeting conference on Recent advances in the message passing interface
Scalable memory registration for high performance networks using helper threads
Proceedings of the 8th ACM International Conference on Computing Frontiers
Improving MPI applications performance on multicore clusters with rank reordering
EuroMPI'11 Proceedings of the 18th European MPI Users' Group conference on Recent advances in the message passing interface
CUDASA: compute unified device and systems architecture
EG PGV'08 Proceedings of the 8th Eurographics conference on Parallel Graphics and Visualization
Analysis of implementation options for MPI-2 one-sided
PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Distributed ONE: scalable parallel network simulation
Proceedings of the 6th International ICST Conference on Simulation Tools and Techniques
Hi-index | 0.00 |
MPICH2 provides a layered architecture for implementing MPI-2. In this paper, we provide a new design for implementing MPI-2 over InfiniBand by extending the MPICH2 ADI3 layer. Our new design aims to achieve high performance by providing a multi-communication method framework that can utilize appropriate communication channels/devices to attain optimal performance without compromising on scalability and portability. We also present the performance comparison of the new design with our previous design based on the MPICH2 RDMA channel. We show significant performance improvements in micro-benchmarks and NAS Parallel Benchmarks.