Realizing the performance potential of the virtual interface architecture
ICS '99 Proceedings of the 13th international conference on Supercomputing
Optimizing threaded MPI execution on SMP clusters
ICS '01 Proceedings of the 15th international conference on Supercomputing
The Virtual Interface Architecture
IEEE Micro
CC--MPI: a compiled communication capable MPI prototype for ethernet switched clusters
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
High performance RDMA-based MPI implementation over InfiniBand
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Proceedings of the 23rd international conference on Supercomputing
Subdomain communication to increase scalability in large-scale scientific applications
Proceedings of the 23rd international conference on Supercomputing
Delta Send-Recv for Dynamic Pipelining in MPI Programs
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Hi-index | 0.00 |
Large message latencies often lead to poor performance of parallel applications. In this paper, we investigate a latency-tolerating technique that immediately releases all blocking receives, even when the message has not yet (completely) arrived, and enforces execution correctness through page protection. This approach eliminates false message data dependencies on incoming messages and allows the computation to proceed as early as possible. We implement and evaluate our early-release technique in the context of an MPI runtime library. The results show that the execution speed of MPI applications improves by up to 60% when early release is enabled. Our approach also enables faster and easier parallel programming as it frees programmers from adopting more complex nonblocking receives and from tuning message sizes to explicitly reduce false message data dependencies.