Analysis of topology-dependent MPI performance on Gemini networks
Proceedings of the 20th European MPI Users' Group Meeting
MPI application performance can vary with the scheduler's placement of ranks, whether across nodes or on cores within the same multi-core chip. By default, MPI applications are at the mercy of the application placement software that assigns nodes to a job. We describe the general approach of node ordering for allocation in a 3D torus and how it improved MPI application performance, even in the face of an anisotropic interconnect. We demonstrate quantitatively that our topology-based ordering improves performance for several MPI applications running on a Top10 supercomputer.
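To make the idea of node ordering for allocation concrete, the following is a minimal sketch and not the ordering used in the paper, which the abstract does not specify. It assumes a simple boustrophedon ("snake") traversal of a 3D grid, so that consecutive entries in the allocation list are always one hop apart. The names `snake_order`, `torus_hops`, and `allocate` are hypothetical, and the choice of which axis varies fastest is arbitrary here; on an anisotropic interconnect that choice would presumably matter.

```python
def snake_order(dims):
    """Return every (x, y, z) coordinate of an nx * ny * nz grid in a
    boustrophedon order: consecutive coordinates differ by exactly one hop.
    This ordering is an illustrative assumption, not the paper's method."""
    nx, ny, nz = dims
    order = []
    for x in range(nx):
        # Reverse the y sweep on every other x slab.
        ys = range(ny) if x % 2 == 0 else range(ny - 1, -1, -1)
        for j, y in enumerate(ys):
            # Reverse the z sweep on every other row visited overall.
            row_index = x * ny + j
            zs = range(nz) if row_index % 2 == 0 else range(nz - 1, -1, -1)
            for z in zs:
                order.append((x, y, z))
    return order


def torus_hops(a, b, dims):
    """Hop distance between two nodes in a 3D torus (with wraparound links)."""
    return sum(min(abs(p - q), d - abs(p - q)) for p, q, d in zip(a, b, dims))


def allocate(order, busy, count):
    """Pick the first `count` free nodes in list order (hypothetical allocator).
    Returns None if fewer than `count` nodes are free."""
    free = [node for node in order if node not in busy]
    return free[:count] if len(free) >= count else None


if __name__ == "__main__":
    dims = (4, 3, 2)
    order = snake_order(dims)
    job = allocate(order, busy=set(), count=6)
    print(job)
```

Because the allocator walks the list in order, a job placed on an otherwise idle machine receives a run of nodes that are pairwise close in the torus, which is the property a topology-aware ordering is meant to provide.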