Multicore processors have not only reintroduced Non-Uniform Memory Access (NUMA) architectures into today's parallel computers, but they are also responsible for non-uniform access times to Input/Output devices (NUIOA). In clusters of multicore machines equipped with several network interfaces, the performance of inter-process communication therefore depends on which cores the processes are scheduled on and on their distance to the Network Interface Cards involved. We propose a technique that enables multirail communication between processes to carefully distribute data among the network interfaces so as to counterbalance NUIOA effects. We demonstrate the relevance of our approach by evaluating its implementation within OpenMPI on a Myri-10G + InfiniBand cluster.