Message Passing for Linux Clusters with Gigabit Ethernet Mesh Connections

Authors:
Jie Chen;William Watson III;Robert Edwards;Weizhen Mao
Affiliations:
HPC Group, Jefferson Lab, Newport News, VA;HPC Group, Jefferson Lab, Newport News, VA;Theory Group, Jefferson Lab, Newport News, VA;College of William and Mary, Williamsburg, VA
Venue:
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 9 - Volume 10
Year:
2005

Abstract

Multiple copper-based commodity Gigabit Ethernet (GigE) interconnects (adapters) on a single host can lead to Linux clusters with mesh/torus connections without using expensive switches and high-speed network interconnects (NICs). However, traditional message passing systems based on TCP for GigE will not perform well for this type of cluster because of the overhead of TCP for multiple GigE links. In this paper, we present two OS-bypass message passing systems that are based on a modified M-VIA (an implementation of the VIA specification) for two production GigE mesh clusters: one is constructed as a 4x8x8 (256 nodes) torus and has been in production use for a year; the other is constructed as a 6x8x8 (384 nodes) torus and was deployed recently. One of the message passing systems targets a specific application domain and is called QMP; the other is an implementation of the MPI 1.1 specification. The GigE mesh clusters using these two message passing systems achieve about 18.5 µs half round-trip latency and 400 MB/s total bandwidth, which compares reasonably well, at much lower cost, to systems using specialized high-speed adapters in a switched architecture.
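
The torus sizes quoted in the abstract can be checked with a short sketch. This is only an illustration, not code from the paper: the function names are hypothetical, and it assumes a plain 3D torus with wraparound in every dimension. Under that assumption each node has six distinct neighbors; this follows from torus geometry and is not a figure stated in the abstract.

```python
def torus_neighbors(coord, dims):
    """Return the distinct neighbors of `coord` in a torus with extents `dims`.

    Hypothetical helper: assumes wraparound links in every dimension.
    """
    neighbors = set()
    for axis, extent in enumerate(dims):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (coord[axis] + step) % extent  # wrap around the ring
            neighbors.add(tuple(n))
    neighbors.discard(tuple(coord))  # a ring of extent 1 would map to itself
    return sorted(neighbors)


def node_count(dims):
    """Total number of nodes in a mesh/torus with extents `dims`."""
    total = 1
    for extent in dims:
        total *= extent
    return total


if __name__ == "__main__":
    # The two cluster shapes named in the abstract.
    for dims in ((4, 8, 8), (6, 8, 8)):
        links = len(torus_neighbors((0, 0, 0), dims))
        print(dims, "nodes:", node_count(dims), "neighbors per node:", links)
```

Running the script reports 256 and 384 nodes for the two shapes, matching the node counts given in the abstract.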