A new fast message passing communication system for multiprocessor workstation clusters

Authors:
Zheng Weimin; Shen Jun; Wang Ju
Affiliations:
Tsinghua Univ., Beijing, China; Tsinghua Univ., Beijing, China; Tsinghua Univ., Beijing, China
Venue:
Progress in computer research
Year:
2001

Abstract

This paper proposes a fast message passing (FMP) communication architecture that reduces the software overhead of communication, a major impediment to good performance on workstation clusters. An FMP system is implemented on the SUN Ultra2-Myrinet platform. Measurements show a one-way latency of 11.2 µs for one-byte packets in network communication and only 4.9 µs in local communication. Bandwidth reaches 338 Mb/s for 8 KB packets in network communication and more than 770 Mb/s in local message passing. These results indicate that FMP can effectively exploit the performance of supercomputers and high-speed networks. Local communication is implemented through shared memory within a single host, while network communication overhead is reduced through a user-level communication protocol, pipelined transmission, credit-based flow control, and multithreading. Furthermore, the system provides an interface for both network and local communication.
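
The abstract names credit flow control as one of the techniques used to cut network overhead but does not describe the scheme itself. The following is a minimal sketch of the general idea of credit-based flow control, not FMP's implementation: a sender may transmit only while it holds credits, each credit stands for one free receive buffer, and the receiver returns a credit whenever the application consumes a packet. All names here (`CreditSender`, `CreditReceiver`, `wire`, `demo`) and the buffer count are hypothetical and chosen for illustration only.

```python
from collections import deque


class CreditSender:
    """Sender side of a credit-based link (illustrative, not FMP's code)."""

    def __init__(self, credits):
        self.credits = credits   # free receive buffers the sender may still use
        self.pending = deque()   # packets waiting for a credit

    def send(self, packet, wire):
        """Queue a packet, then transmit as many queued packets as credits allow."""
        self.pending.append(packet)
        self._drain(wire)

    def on_credit(self, n, wire):
        """The receiver returned n credits after freeing n buffers."""
        self.credits += n
        self._drain(wire)

    def _drain(self, wire):
        while self.credits > 0 and self.pending:
            wire.append(self.pending.popleft())
            self.credits -= 1


class CreditReceiver:
    """Receiver side with a fixed number of buffers (hypothetical size)."""

    def __init__(self, buffers):
        self.buffers = buffers
        self.queue = deque()

    def deliver(self, wire):
        """Move packets from the wire into receive buffers."""
        while wire:
            # With correct credit accounting this assertion never fires.
            assert len(self.queue) < self.buffers, "receive buffer overrun"
            self.queue.append(wire.popleft())

    def consume(self, sender, wire):
        """Application reads one packet; the freed buffer goes back as a credit."""
        packet = self.queue.popleft()
        sender.on_credit(1, wire)
        return packet


def demo():
    """Send five packets through a link that has only two receive buffers."""
    wire = deque()                      # stands in for the network
    sender = CreditSender(credits=2)
    receiver = CreditReceiver(buffers=2)
    for i in range(5):
        sender.send(i, wire)            # packets 2..4 wait for credits
    receiver.deliver(wire)
    received = []
    while receiver.queue:
        received.append(receiver.consume(sender, wire))
        receiver.deliver(wire)
    return received, sender.credits
```

Calling `demo()` pushes five packets through two buffers without overrunning the receiver, and the packets arrive in order. Because the sender never transmits without a credit, the receiver never has to drop or acknowledge-and-retransmit a packet for lack of buffer space, which is the usual motivation for this style of flow control in low-overhead messaging layers.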