In cluster computing, the communication functions provided by current MPI libraries are not well optimized. Performance is especially poor when multiple sources and/or destinations are involved, as is the case in collective communication. Our algorithms use multidimensional factorization and pairwise exchange communication/dissemination methods to improve performance. They outperform previous algorithms such as the ring, recursive doubling, and dissemination algorithms. Experimental results show an improvement of about 50% over MPICH version 1.2.6 on a Linux cluster.
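
For readers unfamiliar with the baselines named above, the following is a minimal sketch of the communication pattern of the classic dissemination algorithm, one of the previous methods the abstract compares against. It is not the authors' multidimensional factorization scheme, which the abstract does not specify. The function names `dissemination_rounds` and `simulate` are hypothetical, and the code simulates the message schedule in plain Python rather than calling any MPI library.

```python
def dissemination_rounds(p):
    """Return the message schedule for p ranks.

    In round k, rank i sends to rank (i + 2**k) mod p. The schedule has
    ceil(log2(p)) rounds; each round is a list of (sender, receiver) pairs.
    """
    rounds = []
    k = 0
    while (1 << k) < p:
        step = 1 << k
        rounds.append([(i, (i + step) % p) for i in range(p)])
        k += 1
    return rounds


def simulate(p):
    """Track which ranks each rank has heard from, directly or transitively."""
    known = [{i} for i in range(p)]
    for pairs in dissemination_rounds(p):
        # Every message in a round carries the sender's state from the
        # start of that round, so take a snapshot before applying updates.
        snapshot = [set(s) for s in known]
        for src, dst in pairs:
            known[dst] |= snapshot[src]
    return known


if __name__ == "__main__":
    p = 6  # illustrative rank count; need not be a power of two
    print(len(dissemination_rounds(p)), "rounds")
    print(all(s == set(range(p)) for s in simulate(p)))
```

The simulation illustrates why the pattern works for any number of ranks: after round k each rank has heard from the 2^(k+1) ranks ending at itself (modulo p), so ceil(log2 p) rounds suffice for every rank to hear from all others.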