Efficient Layering for High Speed Communication: Fast Messages 2.x

Authors:
Mario Lauria; Scott Pakin; Andrew A. Chien
Venue:
HPDC '98 Proceedings of the 7th IEEE International Symposium on High Performance Distributed Computing
Year:
1998

Citing 0
Cited 24

The design and performance of a pluggable protocols framework for real-time distributed object computing middleware

IFIP/ACM International Conference on Distributed systems platforms
Efficient wire formats for high performance computing

Proceedings of the 2000 ACM/IEEE conference on Supercomputing
A middleware toolkit for client-initiated service specialization

ACM SIGOPS Operating Systems Review
ENSEMBLE: A Communication Layer for Embedded Multi-Processor Systems

OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
Communication overlap in multi-tier parallel algorithms

SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Applying patterns to develop a pluggable protocols framework for ORB middleware

Design patterns in communications software
Supporting high-performance I/O in QoS-enabled ORB middleware

Cluster Computing
Event Services in High Performance Systems

Cluster Computing
High Performance Network of PC Cluster Maestro

Cluster Computing
Native Data Representation: An Efficient Wire Format for High-Performance Distributed Computing

IEEE Transactions on Parallel and Distributed Systems
Portals 3.0: Protocol Building Blocks for Low Overhead Communication

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
High Performance Implementation of MPI for Myrinet

ParNum '99 Proceedings of the 4th International ACPC Conference Including Special Tracks on Parallel Numerics and Parallel Computing in Image Processing, Video Processing, and Multimedia: Parallel Computation
A survey of messaging software issues and systems for Myrinet-based clusters

Cluster Computing
High-speed I/O: the operating system as a signalling mechanism

NICELI '03 Proceedings of the ACM SIGCOMM workshop on Network-I/O convergence: experience, lessons, implications
Transport protocols for high performance

Communications of the ACM - Blueprint for the future of high-performance networking
Direct connect device core: design and applications

Integration, the VLSI Journal
Design and Evaluation of an HPVM-Based Windows NT Supercomputer

International Journal of High Performance Computing Applications
Publish-Subscribe for High-Performance Computing

IEEE Internet Computing
Studying the performance of overlapping communication and computation by active message: INUKTITUT case

PDCN'06 Proceedings of the 24th IASTED international conference on Parallel and distributed computing and networks
RISC: A resilient interconnection network for scalable cluster storage systems

Journal of Systems Architecture: the EUROMICRO Journal
On chip novel video streaming system for bi-network multicasting protocols

Integration, the VLSI Journal
Parallel and distributed computing on multidomain non-routable networks

International Journal of High Performance Computing and Networking
A multi-protocol communication architecture for metacomputing

ICCOM'06 Proceedings of the 10th WSEAS international conference on Communications
Exploiting multidomain non routable networks

ISPA'06 Proceedings of the 4th international conference on Parallel and Distributed Processing and Applications

Abstract

We describe our experience designing, implementing, and evaluating two generations of our high-performance communication library, Fast Messages (FM), for Myrinet. In FM 1.x, we designed a simple interface and provided guarantees of reliable, in-order delivery and flow control. While this was a significant improvement over previous systems, it was not enough: layering MPI atop FM 1.x showed that only about 20% of the FM 1.x bandwidth could be delivered to higher-level communication APIs. Our second-generation communication layer, FM 2.0, addresses the identified problems, providing gather-scatter, interlayer scheduling, and receiver flow control, as well as some convenient API features that simplify programming. FM 2.x can deliver 70-90% of its bandwidth to higher-level APIs such as MPI. This is especially impressive because the absolute bandwidths delivered have increased nearly fourfold, to 70 MB/s. We describe general issues encountered in matching two communication layers, and our solutions as embodied in FM 2.x.
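
The gather-scatter feature named in the abstract can be illustrated with a minimal sketch. It does not use the FM API, which this record does not show. Instead it assumes a POSIX system and uses Python's standard `os.writev` and `os.readv` over a pipe as a stand-in transport. The helper names `send_gathered` and `receive_scattered`, and the 4-byte header, are hypothetical. The idea is that an upper layer such as MPI can hand its own header and the user's payload to the lower layer as separate pieces, which avoids copying both into one staging buffer first.

```python
import os

def send_gathered(fd, header: bytes, payload: bytes) -> int:
    """Gather: write header and payload in one call, with no staging copy.

    Hypothetical helper for illustration; not part of Fast Messages.
    """
    return os.writev(fd, [header, payload])

def receive_scattered(fd, header_len: int, payload_len: int):
    """Scatter: read one incoming message directly into two separate buffers.

    A single readv may return fewer bytes than requested on a real network;
    this sketch relies on the small message already being in the pipe.
    """
    header = bytearray(header_len)
    payload = bytearray(payload_len)
    received = os.readv(fd, [header, payload])
    return received, bytes(header), bytes(payload)

# A pipe stands in for the network link (assumption made for this sketch).
read_fd, write_fd = os.pipe()
sent = send_gathered(write_fd, b"HDR1", b"user data")
received, header, payload = receive_scattered(read_fd, 4, 9)
os.close(read_fd)
os.close(write_fd)
```

After the exchange, `header` holds the upper layer's header and `payload` holds the user data, each in its own buffer, although both crossed the transport as one message.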