XTP: the Xpress Transfer Protocol
XTP: the Xpress Transfer Protocol
High-performance switching with fibre channel
COMPCON '92 Proceedings of the thirty-seventh international conference on COMPCON
Active messages: a mechanism for integrated communication and computation
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The importance of non-data touching processing overheads in TCP/IP
SIGCOMM '93 Conference proceedings on Communications architectures, protocols and applications
Fbufs: a high-bandwidth cross-domain transfer facility
SOSP '93 Proceedings of the fourteenth ACM symposium on Operating systems principles
Software overhead in messaging layers: where does the time go?
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
U-Net: a user-level network interface for parallel and distributed computing
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
High performance messaging on workstations: Illinois fast messages (FM) for Myrinet
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
A comparison of architectural support for messaging in the TMC CM-5 and the Cray T3D
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Effects of buffering semantics on I/O performance
OSDI '96 Proceedings of the second USENIX symposium on Operating systems design and implementation
Runtime mechanisms for efficient dynamic multithreading
Journal of Parallel and Distributed Computing - Special issue on multithreading for multiprocessors
Profiling and reducing processing overheads in TCP/IP
IEEE/ACM Transactions on Networking (TON)
MPI-FM: high performance MPI on workstation clusters
Journal of Parallel and Distributed Computing - Special issue on workstation clusters and network-based computing
Fast Messages: Efficient, Portable Communication for Workstation Clusters and MPPs
IEEE Parallel & Distributed Technology: Systems & Technology
Supporting High Level Programming with High Performance: The Illinois Concert System
HIPS '97 Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
Cut-through delivery in Trapeze: An exercise in low-latency messaging
HPDC '97 Proceedings of the 6th IEEE International Symposium on High Performance Distributed Computing
A Software Architecture for Global Address Space Communication on Clusters: Put/Get on Fast Messages
HPDC '98 Proceedings of the 7th IEEE International Symposium on High Performance Distributed Computing
Incorporating Memory Management into User-Level Network Interfaces
Incorporating Memory Management into User-Level Network Interfaces
ATEC '96 Proceedings of the 1996 annual conference on USENIX Annual Technical Conference
High-performance local area communication with fast sockets
ATEC '97 Proceedings of the annual conference on USENIX Annual Technical Conference
An Architecture for Using Multiple Communication Devices in a MPI Library
HPCN Europe 2000 Proceedings of the 8th International Conference on High-Performance Computing and Networking
Messaging on Gigabit Ethernet: Some Experiments with GAMMA and Other Systems
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Hi-index | 0.00 |
We describe our experience of designing, implementing, and evaluating two generations of high performance communication libraries, Fast Messages (FM) for Myrinet. In FM 1, we designed a simple interface and provided guarantees of reliable and in-order delivery, and flow control. While this was a significant improvement over previous systems, it was not enough. Layering MPI atop FM 1 showed that only about 35% of the FM 1 bandwidth could be delivered to higher level communication APIs. Our second generation communication layer, FM 2, addresses the identified problems, providing gather-scatter, interlayer scheduling, receiver flow control, as well as some convenient API features which simplify programming. FM 2 can deliver 55–95% to higher level APIs such as MPI. This is especially impressive as the absolute bandwidths delivered have increased over fourfold to 90 MB/s. We describe general issues encountered in matching two communication layers, and our solutions as embodied in FM 2.