Multicore processors have not only reintroduced Non-Uniform Memory Access (NUMA) architectures into today's parallel computers, but they are also responsible for non-uniform access times to Input/Output devices (NUIOA). In clusters of multicore machines equipped with several network interfaces, the performance of inter-process communication therefore depends on which cores the processes are scheduled on and on their distance to the Network Interface Cards involved. We propose a technique that enables multirail communication between processes to carefully distribute data among the network interfaces so as to counterbalance NUIOA effects. We demonstrate the relevance of our approach by evaluating its implementation within OpenMPI on a Myri-10G + InfiniBand cluster.