An efficient collective communication method for grid scale networks
ICCS'03 Proceedings of the 2003 international conference on Computational science
In this paper, we propose a packet-level parallel data transfer technique and a Two-Phase Scheduling (TPS) algorithm for the collective communication primitives in MPICH-G2. The approach has two distinguishing features: 1) packets are transferred concurrently from a source node to multiple destination nodes, and 2) scheduling improves the performance of collective communication by identifying bottleneck-inducing nodes early. We implemented the proposed technique and measured its performance. In the evaluation, the proposed method achieved about a 20% performance improvement over conventional block data transfer when a binomial tree is used for communication on a LAN. The TPS algorithm delays the distribution of messages to bottleneck-inducing nodes to minimize their effect on overall performance. On a WAN, the TPS algorithm also achieved significant performance improvements across various data sizes and numbers of nodes.
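The two ideas named in the abstract, a binomial broadcast tree and delaying sends to bottleneck nodes, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the fixed cost threshold used to classify a node as a bottleneck, and the serial send-cost model are all assumptions made for illustration only.

```python
def binomial_children(rank, size):
    """Children of `rank` in a binomial broadcast tree rooted at rank 0.

    Standard construction: a node forwards to rank | mask for every
    power-of-two mask below its own lowest set bit.
    """
    children = []
    mask = 1
    while mask < size:
        if rank & mask:          # reached this node's lowest set bit
            break
        child = rank | mask
        if child < size:
            children.append(child)
        mask <<= 1
    return children[::-1]        # largest subtree first


def two_phase_order(send_cost, threshold):
    """Hypothetical two-phase ordering (assumption, not the paper's rule).

    Phase 1: nodes whose send cost is at most `threshold`, in given order.
    Phase 2: the remaining (bottleneck) nodes, cheapest first.
    """
    fast = [n for n, c in send_cost.items() if c <= threshold]
    slow = [n for n, c in send_cost.items() if c > threshold]
    return fast + sorted(slow, key=send_cost.get)


def completion_times(order, send_cost):
    """Time at which each node has the data if the root sends serially."""
    elapsed = 0.0
    done = {}
    for node in order:
        elapsed += send_cost[node]
        done[node] = elapsed
    return done


# Illustrative costs only; "b" plays the role of a bottleneck node.
costs = {"a": 1.0, "b": 10.0, "c": 1.0, "d": 1.0}
naive = completion_times(list(costs), costs)
delayed = completion_times(two_phase_order(costs, threshold=5.0), costs)
```

In this simplified serial model the last node finishes at the same time under both orders, but nodes `c` and `d` receive the data much earlier when the send to `b` is delayed. In a tree-based broadcast those nodes could start forwarding sooner, which is the intuition behind delaying bottleneck nodes.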