Exploiting multiple heterogeneous networks to reduce communication costs in parallel programs

Authors:
JunSeong Kim; D. J. Lilja
Venue:
HCW '97 Proceedings of the 6th Heterogeneous Computing Workshop (HCW '97)
Year:
1997

Abstract

The different types of messages used by a parallel application program executing in a distributed system can each have unique characteristics, so no single communication network produces the lowest latency for all messages. For instance, short control messages may be sent with the lowest overhead on one type of network, such as Ethernet, while bulk data transfers may be better suited to a different type of network, such as Fibre Channel or HiPPI. In this paper, we investigate how to exploit multiple heterogeneous communication networks that interconnect the same set of processing nodes by dynamically selecting the best (lowest-latency) network for each message based on the message size. We also show how to aggregate these multiple parallel networks into a single virtual network to further reduce latency and increase the available bandwidth. We test this multiplexing and aggregation on a cluster of SGI multiprocessors interconnected with both Fibre Channel and Ethernet. We find that multiplexing between Ethernet and Fibre Channel can substantially reduce communication overhead in a synthetic benchmark compared to using either network alone. Aggregating the two networks into a single virtual network can further reduce communication delays for applications with many large messages. Whether multiplexing or aggregation is the better choice depends on the mix of message sizes in the application program and the relative overheads of the two networks.
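
The two techniques named in the abstract, per-message network selection by size and aggregation of both networks into one virtual link, can be illustrated with a minimal Python sketch. It assumes a simple linear cost model (fixed per-message overhead plus size divided by bandwidth), which the abstract does not specify. The names `Network`, `select_network`, and `aggregate_split`, and all overhead and bandwidth values, are hypothetical and are not measurements or code from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    name: str
    overhead_s: float      # fixed per-message cost in seconds (hypothetical value)
    bandwidth_Bps: float   # sustained bandwidth in bytes/second (hypothetical value)

    def latency(self, size_bytes: float) -> float:
        """Assumed linear cost model: fixed overhead plus transfer time."""
        return self.overhead_s + size_bytes / self.bandwidth_Bps


# Illustrative parameters only; they are not measurements from the paper.
ETHERNET = Network("ethernet", overhead_s=0.5e-3, bandwidth_Bps=1.25e6)
FIBRE_CHANNEL = Network("fibre_channel", overhead_s=2.0e-3, bandwidth_Bps=25.0e6)


def select_network(networks, size_bytes):
    """Multiplexing: pick the network with the lowest modeled latency."""
    return min(networks, key=lambda net: net.latency(size_bytes))


def aggregate_split(a, b, size_bytes):
    """Aggregation: split one message so both parts finish together.

    Solves a.latency(x) == b.latency(size_bytes - x) for x, then clamps x
    to [0, size_bytes] so small messages go entirely to one network.
    Returns (bytes_on_a, bytes_on_b) as floats.
    """
    x = (b.overhead_s - a.overhead_s + size_bytes / b.bandwidth_Bps) / (
        1.0 / a.bandwidth_Bps + 1.0 / b.bandwidth_Bps
    )
    x = min(max(x, 0.0), float(size_bytes))
    return x, size_bytes - x
```

Under these assumed parameters, `select_network([ETHERNET, FIBRE_CHANNEL], 100)` picks the low-overhead network and a 1,000,000-byte message is routed to the high-bandwidth one. `aggregate_split` sends a small share of a large message over the slower link so that both transfers complete at the same modeled time, which is earlier than either network would finish alone. A real system would need measured overheads and would also pay a cost for splitting and reassembly, which this sketch ignores.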