Multi-core and network aware MPI topology functions

Authors:
Mohammad Javad Rashti; Jonathan Green; Pavan Balaji; Ahmad Afsahi; William Gropp
Affiliations:
Queen's University, Kingston, ON, Canada; Queen's University, Kingston, ON, Canada; Argonne National Laboratory, Argonne, IL; Queen's University, Kingston, ON, Canada; University of Illinois at Urbana-Champaign, IL
Venue:
EuroMPI'11 Proceedings of the 18th European MPI Users' Group conference on Recent advances in the message passing interface
Year:
2011

Abstract

The MPI standard offers a set of topology-aware interfaces that can be used to construct graph and Cartesian topologies for MPI applications. These interfaces have mostly been used for topology construction rather than for performance improvement. In this paper, to optimize performance, we use graph embedding and node/network architecture discovery modules to match the communication topology of applications to the physical topology of multi-core clusters with multi-level networks. Micro-benchmark results show considerable improvement in communication performance when using weighted and network-aware mapping. We also show that the implementation can improve the communication and execution times of applications.
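
To make the idea of matching a weighted communication topology to a physical topology concrete, the following is a minimal sketch in Python. It is not the paper's algorithm: the greedy heuristic, the function names (hop_cost, mapping_cost, greedy_map), and the two-level distance model (cost 1 inside a node, cost 10 across the network) are all assumptions chosen for illustration. The sketch only shows how a weighted, hierarchy-aware placement can lower total communication cost compared with a default rank order.

```python
def hop_cost(core_a, core_b, cores_per_node):
    """Assumed distance model: 0 same core, 1 same node, 10 across the network."""
    if core_a == core_b:
        return 0
    same_node = core_a // cores_per_node == core_b // cores_per_node
    return 1 if same_node else 10


def mapping_cost(weights, placement, cores_per_node):
    """Sum of (traffic weight x distance) over all communicating rank pairs."""
    return sum(w * hop_cost(placement[i], placement[j], cores_per_node)
               for (i, j), w in weights.items())


def greedy_map(num_ranks, weights, cores_per_node):
    """Hypothetical greedy embedding: place heavy communicators close together."""
    # Symmetric traffic matrix built from the weighted edge list.
    traffic = [[0] * num_ranks for _ in range(num_ranks)]
    for (i, j), w in weights.items():
        traffic[i][j] += w
        traffic[j][i] += w

    free_cores = list(range(num_ranks))  # assume one core per rank
    placement = {}

    # Seed with the rank that has the largest total traffic.
    first = max(range(num_ranks), key=lambda r: sum(traffic[r]))
    placement[first] = free_cores.pop(0)

    while len(placement) < num_ranks:
        unplaced = [r for r in range(num_ranks) if r not in placement]
        # Next rank: the one with the heaviest traffic to already placed ranks.
        nxt = max(unplaced, key=lambda r: sum(traffic[r][p] for p in placement))
        # Best core: the free core with the lowest weighted distance to them.
        best = min(free_cores, key=lambda c: sum(
            traffic[nxt][p] * hop_cost(c, placement[p], cores_per_node)
            for p in placement))
        free_cores.remove(best)
        placement[nxt] = best
    return placement


# Illustrative input: 4 ranks, 2 cores per node, heavy traffic on (0,2) and (1,3).
weights = {(0, 2): 100, (1, 3): 100, (0, 1): 1, (2, 3): 1}
identity = {r: r for r in range(4)}
mapped = greedy_map(4, weights, cores_per_node=2)
print(mapping_cost(weights, identity, 2), mapping_cost(weights, mapped, 2))
# -> 2002 220
```

In this toy input, the default order puts the heavily communicating pairs on different nodes, while the greedy placement co-locates them. In an MPI program, the weighted graph would typically be handed to a topology function such as MPI_Dist_graph_create with reordering enabled, leaving the mapping decision to the library.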