Multigrid
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
MPI/RT --- An Emerging Standard for High-Performance Real-Time Systems
HICSS '98 Proceedings of the Thirty-First Annual Hawaii International Conference on System Sciences - Volume 3
A Framework for Collective Personalized Communication
IPDPS '03 Proceedings of the 17th International Symposium on Parallel and Distributed Processing
Send-receive considered harmful: Myths and realities of message passing
ACM Transactions on Programming Languages and Systems (TOPLAS)
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Implementation and performance analysis of non-blocking collective operations for MPI
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Improving the Performance of Multiple Conjugate Gradient Solvers by Exploiting Overlap
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
A case for non-blocking collective operations
ISPA'06 Proceedings of the 2006 international conference on Frontiers of High Performance Computing and Networking
A case for standard non-blocking collective operations
PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Hi-index | 0.00 |
This paper presents a case study about the applicability and usage of non-blocking collective operations. These operations provide the ability to overlap communication with computation and to avoid unnecessary synchronization. We introduce our NBC library, a portable low-overhead implementation of non-blocking collectives on top of MPI-1. We demonstrate the easy usage of the NBC library with the optimization of a conjugate gradient solver with only minor changes to the traditional parallel implementation of the program. The optimized solver runs up to 34% faster and is able to overlap most of the communication. We show that there is, due to the overlap, no performance difference between Gigabit Ethernet and InfiniBandTM for our calculation.