The different types of messages used by a parallel application program executing in a distributed system can each have distinct characteristics, so no single communication network can produce the lowest latency for all messages. For instance, short control messages may be sent with the lowest overhead on one type of network, such as Ethernet, while bulk data transfers may be better suited to a different type of network, such as Fibre Channel or HiPPI. In this paper, we investigate how to exploit multiple heterogeneous communication networks that interconnect the same set of processing nodes by dynamically selecting the best (lowest-latency) network for each message based on the message size. We also show how to aggregate these parallel networks into a single virtual network to further reduce latency and increase the available bandwidth. We test both multiplexing and aggregation on a cluster of SGI multiprocessors interconnected with both Fibre Channel and Ethernet. We find that multiplexing between Ethernet and Fibre Channel can substantially reduce communication overhead in a synthetic benchmark compared to using either network alone. Aggregating the two networks into a single virtual network can further reduce communication delays for applications with many large messages. The best choice between multiplexing and aggregation depends on the mix of message sizes in the application program and the relative overheads of the two networks.
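The two techniques described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simple linear cost model (latency = fixed per-message overhead + size / bandwidth), which the abstract does not state, and the overhead and bandwidth figures are hypothetical placeholders rather than measurements from the paper. The names `Network`, `select_network`, `crossover_bytes`, and `aggregate_split` are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    """Assumed linear cost model for one network (not taken from the paper)."""
    name: str
    overhead_us: float     # fixed per-message cost in microseconds
    bandwidth_mb_s: float  # sustained bandwidth; 1 MB/s equals 1 byte/us

    def latency_us(self, size_bytes):
        # Time to deliver one message of the given size on this network.
        return self.overhead_us + size_bytes / self.bandwidth_mb_s


def select_network(networks, size_bytes):
    """Multiplexing: pick the network with the lowest modeled latency."""
    return min(networks, key=lambda n: n.latency_us(size_bytes))


def crossover_bytes(a, b):
    """Message size at which a and b have equal modeled latency, or None."""
    denom = 1.0 / a.bandwidth_mb_s - 1.0 / b.bandwidth_mb_s
    if denom == 0:
        return None  # equal bandwidths: the latency lines never cross
    size = (b.overhead_us - a.overhead_us) / denom
    return size if size > 0 else None


def aggregate_split(a, b, size_bytes):
    """Aggregation: split one message so both parts finish at the same time.

    Solves a.overhead + x / a.bw == b.overhead + (size - x) / b.bw for x,
    then clamps x to [0, size] so a small message stays on one network.
    Splitting and reassembly costs are ignored in this sketch.
    """
    x = (b.overhead_us - a.overhead_us + size_bytes / b.bandwidth_mb_s) / (
        1.0 / a.bandwidth_mb_s + 1.0 / b.bandwidth_mb_s
    )
    x = min(max(x, 0.0), size_bytes)
    return x, size_bytes - x


# Hypothetical parameters, chosen only to show the shape of the trade-off.
ethernet = Network("Ethernet", overhead_us=100.0, bandwidth_mb_s=1.0)
fibre_channel = Network("Fibre Channel", overhead_us=500.0, bandwidth_mb_s=25.0)

for size in (64, 1_000_000):
    best = select_network([ethernet, fibre_channel], size)
    part_eth, part_fc = aggregate_split(ethernet, fibre_channel, size)
    print(size, best.name, round(part_eth), round(part_fc))
```

Under this model, the low-overhead network wins below the crossover size and the high-bandwidth network wins above it, while aggregation only helps once a message is large enough that the extra bandwidth outweighs paying both overheads. This is consistent with the abstract's observation that the better choice depends on the message-size mix and the relative overheads.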