In the last decade, cluster computing has become the most popular high-performance computing architecture. Although numerous technological innovations have been proposed to improve the interconnection of nodes, many clusters still rely on commodity Ethernet hardware to implement message-passing within parallel applications. We present Open-MX, an open-source message-passing stack over generic Ethernet. It offers the same abilities as the specialized Myrinet Express stack, without requiring dedicated support from the networking hardware. Open-MX works transparently in the most popular MPI implementations through its MX interface compatibility. It also enables interoperability between hosts running the specialized MX stack and generic Ethernet hosts. We detail how Open-MX copes with the inherent limitations of the Ethernet hardware to satisfy the requirements of message-passing by applying an innovative copy offload model. Combined with a careful tuning of the fabric and of the MX wire protocol, Open-MX achieves better performance than TCP implementations, especially on 10 gigabit/s hardware.