MPI communicators abstract communication operations across application modules, facilitating the seamless composition of different libraries. In addition, communicators provide the ability to form groups of processes and to establish multiple levels of parallelism. Traditionally, communicators have been created collectively in the context of the parent communicator. The recent thrust toward systems at petascale and beyond has brought forth new application use cases, including fault tolerance and load balancing, for which the ability to construct an MPI communicator in the context of its new process group alone is a key capability. However, it has long been believed that MPI does not allow the user to form a new communicator in this way. We present a new algorithm that creates such flexible process groups using only the functionality available in the current MPI standard. We explore the performance implications of this technique and demonstrate its utility for load balancing in the context of a Markov chain Monte Carlo computation. Compared with a traditional collective approach, noncollective communicator creation enables a 30% improvement in execution time through asynchronous load balancing.
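
The abstract does not spell out the algorithm, but the construction it describes can be sketched against the standard MPI interface using recursive intercommunicator merging: each member starts from MPI_COMM_SELF, and adjacent subgroups repeatedly pair up and join through MPI_Intercomm_create and MPI_Intercomm_merge until the whole group is covered. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation; the function name group_comm_create, the pg_ranks/pg_size parameters, and the caller-chosen tag are all hypothetical.

```c
#include <mpi.h>

/* A minimal sketch, assuming the new group's members are listed by their
 * ranks on peer_comm in pg_ranks[0..pg_size-1], in the desired rank order,
 * and that every listed process (and only those) calls this function.
 * The name group_comm_create and the caller-chosen tag are illustrative,
 * not taken from the paper. */
static MPI_Comm group_comm_create(MPI_Comm peer_comm, const int *pg_ranks,
                                  int pg_size, int tag)
{
    int peer_rank, me = -1;
    MPI_Comm_rank(peer_comm, &peer_rank);
    for (int i = 0; i < pg_size; i++)
        if (pg_ranks[i] == peer_rank)
            me = i;                      /* my index within the new group */
    if (me < 0)
        return MPI_COMM_NULL;            /* caller is not a group member */

    MPI_Comm comm;
    MPI_Comm_dup(MPI_COMM_SELF, &comm);  /* start as a subgroup of one */

    /* O(log pg_size) rounds: adjacent subgroups of size `size` pair up
     * and join through an intercommunicator. */
    for (int size = 1; size < pg_size; size *= 2) {
        int gid = me / size;             /* index of my current subgroup */
        MPI_Comm ic = MPI_COMM_NULL, merged = MPI_COMM_NULL;

        if (gid % 2 == 0 && (gid + 1) * size < pg_size) {
            /* Low half of a pair: contact the leader of the next subgroup. */
            MPI_Intercomm_create(comm, 0, peer_comm,
                                 pg_ranks[(gid + 1) * size], tag, &ic);
            MPI_Intercomm_merge(ic, 0 /* keep the low ranks */, &merged);
        } else if (gid % 2 == 1) {
            /* High half of a pair: contact the leader of the previous one. */
            MPI_Intercomm_create(comm, 0, peer_comm,
                                 pg_ranks[(gid - 1) * size], tag, &ic);
            MPI_Intercomm_merge(ic, 1 /* take the high ranks */, &merged);
        }
        if (merged != MPI_COMM_NULL) {   /* unpaired subgroups skip a round */
            MPI_Comm_free(&ic);
            MPI_Comm_free(&comm);
            comm = merged;
        }
    }
    return comm;  /* intracommunicator ordered by position in pg_ranks */
}
```

Because each round doubles the subgroup size, such a construction takes a logarithmic number of merging steps in the group size, and only the members of the new group participate; the caller-chosen tag, used by the subgroup leaders on peer_comm, is what would keep concurrent creations over the same parent from interfering.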