EuroPVM/MPI '06: Proceedings of the 13th European PVM/MPI User's Group Conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
We consider parallel applications that use the MPI programming interface for inter-process communication, and we determine the processor communication overhead on high-performance computing clusters built with high-speed interconnects such as PathScale InfiniPath and running the open-source Open MPI implementation, the PathScale MPI implementation, or both. We show that, for large messages, the processor overhead is substantial for both MPI implementations and for both network interconnects. We then develop a technique, based on multi-threading within the MPI application, for overcoming this processor communication overhead, and we demonstrate that it dramatically reduces the overhead's impact.
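As an illustration of the general approach described in the abstract, the following is a minimal sketch, not the authors' implementation, of how a dedicated POSIX thread can absorb the send-side processor overhead of a large MPI message while the main thread continues computing. The message size, the placeholder computation, and the reliance on MPI_THREAD_MULTIPLE support are assumptions made for this example; the paper's actual technique may differ.

/*
 * Hypothetical sketch (not the paper's code): delegate a large MPI_Send
 * to a helper POSIX thread so the sending process keeps computing while
 * the MPI library pays the per-message processor overhead.
 * Build with: mpicc -pthread overlap.c -o overlap
 */
#include <mpi.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 22)  /* illustrative large message: 4M doubles (32 MB) */

struct send_args {
    double *buf;
    int dest;
};

/* Communication thread: performs the blocking send off the critical path. */
static void *comm_thread(void *p)
{
    struct send_args *a = p;
    MPI_Send(a->buf, N, MPI_DOUBLE, a->dest, 0, MPI_COMM_WORLD);
    return NULL;
}

int main(int argc, char **argv)
{
    int provided, rank, size;

    /* Ask the MPI library for full thread support. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *buf = malloc(N * sizeof *buf);
    for (int i = 0; i < N; i++)
        buf[i] = (double)rank;

    if (rank == 0 && size > 1) {
        struct send_args a = { buf, 1 };
        pthread_t t;
        pthread_create(&t, NULL, comm_thread, &a);

        /* Main thread overlaps useful work with the in-flight send. */
        double acc = 0.0;
        for (long i = 0; i < 100000000L; i++)
            acc += (double)i * 1e-9;

        pthread_join(t, NULL);
        printf("rank 0: computation overlapped with send, acc=%f\n", acc);
    } else if (rank == 1) {
        MPI_Recv(buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}

Requesting MPI_THREAD_MULTIPLE lets the helper thread call MPI concurrently with any communication the main thread might issue; if an implementation offers only MPI_THREAD_FUNNELED, all MPI calls would instead have to be routed through a single thread.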