Current implementations of process groups (subcommunicators) have non-scalable, O(group size) memory footprints and even worse time complexities for setting up communication. We propose system-ranked process groups, in which member ranks are assigned by the runtime system, as a cheaper and faster alternative for a subset of collective operations (barrier, broadcast, reduction, allreduce). This paper presents two distributed algorithms for constructing balanced, k-ary spanning trees over system-ranked process groups obtained by splitting a parent group. Our schemes have much smaller memory footprints and also perform better, even at modest process counts. We demonstrate performance results on up to 131,072 cores of Blue Gene/P.
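To make the tree structure concrete, the following is a minimal sketch (in C) of the local parent/child computation for a balanced k-ary spanning tree rooted at rank 0. It is not the paper's distributed construction algorithm; it only assumes that the runtime has already assigned contiguous system ranks 0..n-1 to the group members, after which each member can derive its tree neighbors locally.

```c
/* Minimal sketch (not the paper's algorithm): local parent/child
 * computation for a balanced k-ary spanning tree rooted at rank 0,
 * assuming contiguous system ranks 0..n-1 within the group. */
#include <stdio.h>

/* Parent of rank r in a k-ary tree rooted at 0; the root has no parent. */
static int kary_parent(int r, int k) {
    return (r == 0) ? -1 : (r - 1) / k;
}

/* Children of rank r are ranks k*r+1 .. k*r+k that fall below n.
 * Returns the number of children written into the children array. */
static int kary_children(int r, int k, int n, int *children) {
    int count = 0;
    for (int i = 1; i <= k; i++) {
        int c = k * r + i;
        if (c < n) children[count++] = c;
    }
    return count;
}

int main(void) {
    int k = 3, n = 10, children[3];
    for (int r = 0; r < n; r++) {
        int nc = kary_children(r, k, n, children);
        printf("rank %d: parent %d, %d child(ren)\n", r, kary_parent(r, k), nc);
    }
    return 0;
}
```

Because every member can compute its parent and children from its own rank, the group size, and the arity k, a tree-based barrier, broadcast, or reduction needs no per-member membership table, which is the source of the memory savings claimed above.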