This paper describes a protocol for a general-purpose cluster communication system that supports multiprogramming with virtual networks, direct and protected network access, reliable message delivery using message timeouts and retransmissions, a powerful return-to-sender error model for applications, and automatic network mapping. The protocols use simple, low-cost mechanisms that exploit properties of our interconnect without limiting flexibility, usability, or robustness. We have implemented the protocols in an active message communication system that runs on a network of 100+ Sun UltraSPARC workstations interconnected with 40 Myrinet switches. A progression of microbenchmarks demonstrates good performance (42-microsecond round-trip times and 31 MB/s node-to-node bandwidth) as well as scalability under heavy load and graceful performance degradation in the presence of high contention.
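As an illustration only, reliable delivery with timeouts, retransmissions, and an error path that hands undeliverable messages back to the application might be sketched as follows. This is not the protocol from the paper: the class `ReliableSender`, the `transmit` callback, the per-message sequence numbers, and the values of `timeout` and `max_attempts` are all assumptions made for the example, and time is passed in explicitly so that the behavior is deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Pending:
    payload: bytes
    deadline: float      # time at which the next retransmission is due
    attempts: int = 1    # transmissions so far, including the first


@dataclass
class ReliableSender:
    """Hypothetical sender: retransmit on timeout, then return to sender."""
    transmit: Callable[[int, bytes], None]   # assumed unreliable send hook
    timeout: float = 1.0                      # assumed value
    max_attempts: int = 3                     # assumed value
    returned: List[bytes] = field(default_factory=list)
    _pending: Dict[int, Pending] = field(default_factory=dict)
    _next_seq: int = 0

    def send(self, payload: bytes, now: float) -> int:
        """Transmit once and remember the message until it is acknowledged."""
        seq = self._next_seq
        self._next_seq += 1
        self._pending[seq] = Pending(payload, now + self.timeout)
        self.transmit(seq, payload)
        return seq

    def on_ack(self, seq: int) -> None:
        """An acknowledgment arrived: stop tracking the message."""
        self._pending.pop(seq, None)

    def tick(self, now: float) -> None:
        """Retransmit expired messages; give up after max_attempts."""
        for seq, p in list(self._pending.items()):
            if now < p.deadline:
                continue
            if p.attempts >= self.max_attempts:
                # Undeliverable: hand the message back to the application.
                del self._pending[seq]
                self.returned.append(p.payload)
            else:
                p.attempts += 1
                p.deadline = now + self.timeout
                self.transmit(seq, p.payload)
```

In this sketch the application would inspect `returned` to decide what to do with messages that could not be delivered; a real system would more likely invoke an error handler, and would also need duplicate suppression at the receiver, which is omitted here.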