PARA '02 Proceedings of the 6th International Conference on Applied Parallel Computing Advanced Scientific Computing
When parallel programs run on clusters of individual computers or workstations, network communication is often the performance bottleneck. Because the round-trip time for a network packet is orders of magnitude larger than the time needed to transfer an equivalent amount of data from memory, methods that reduce network usage can significantly improve the performance of parallel programs.

This work demonstrates that broadcast performance can be improved by a significant factor by using a portable, reliable multicast protocol instead of the unicasting that is typically used. Our end product is a patch for MPICH, a popular, portable MPI implementation provided by Argonne National Laboratory (ANL). The patch requires no kernel modification and is therefore portable to any UNIX-based system. Because absolute reliability is critical for data integrity when broadcasting messages on clusters, our multicast protocol also addresses reliability issues.
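To make the network-usage argument concrete, the following is a minimal sketch that counts data-carrying transmissions for a single broadcast under three schemes. It is a simplified model, not the protocol described in this work: the function names are hypothetical, packet loss and acknowledgment traffic are ignored, and the binomial-tree scheme is included only as a common unicast-based broadcast strategy for comparison.

```python
def flat_unicast_sends(n_procs: int) -> int:
    """Root sends a separate copy of the message to every other process."""
    return max(n_procs - 1, 0)


def tree_unicast_rounds(n_procs: int) -> int:
    """Sequential rounds in a binomial-tree broadcast: ceil(log2(n_procs)).

    In each round, every process that already holds the message forwards
    it to one process that does not, so coverage doubles per round.
    """
    return (n_procs - 1).bit_length() if n_procs > 1 else 0


def multicast_data_sends(n_procs: int) -> int:
    """Assumption: one datagram to the multicast group reaches all receivers.

    A reliable protocol would add acknowledgment or retransmission
    traffic on top of this; that overhead is not modeled here.
    """
    return 1 if n_procs > 1 else 0


if __name__ == "__main__":
    for n in (2, 8, 32):
        print(n, flat_unicast_sends(n), tree_unicast_rounds(n),
              multicast_data_sends(n))
```

Under these assumptions the payload crosses the network once with multicast regardless of cluster size, while unicast-based schemes grow with the number of processes, either in total sends or in sequential rounds.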