Multiple copper-based commodity Gigabit Ethernet (GigE) interconnects (adapters) on a single host can lead to Linux clusters with mesh/torus connections without using expensive switches and high-speed network interconnects (NICs). However, traditional message passing systems based on TCP for GigE will not perform well on this type of cluster because of the overhead TCP incurs across multiple GigE links. In this paper, we present two OS-bypass message passing systems, both based on a modified M-VIA (an implementation of the VIA specification), for two production GigE mesh clusters: one is constructed as a 4x8x8 (256-node) torus and has been in production use for a year; the other is constructed as a 6x8x8 (384-node) torus and was deployed recently. One of the message passing systems, called QMP, targets a specific application domain; the other is an implementation of the MPI 1.1 specification. The GigE mesh clusters using these two message passing systems achieve about 18.5 µs half round-trip latency and 400 MB/s total bandwidth, which compares reasonably well with systems that use specialized high-speed adapters in a switched architecture, at much lower cost.
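
To make the torus layout concrete, the following is a minimal sketch of how the directly connected neighbors of a node could be enumerated in a 3D torus such as the 4x8x8 or 6x8x8 configurations mentioned above. The row-major rank numbering and the function names (`rank_to_coords`, `coords_to_rank`, `torus_neighbors`) are assumptions made for illustration; they are not taken from QMP, the MPI implementation, or M-VIA.

```python
from typing import List, Tuple

Dims = Tuple[int, int, int]


def rank_to_coords(rank: int, dims: Dims) -> Tuple[int, int, int]:
    """Map a node rank to (x, y, z), assuming row-major numbering (hypothetical)."""
    nx, ny, nz = dims
    if not 0 <= rank < nx * ny * nz:
        raise ValueError("rank out of range for the given torus dimensions")
    x, rem = divmod(rank, ny * nz)
    y, z = divmod(rem, nz)
    return x, y, z


def coords_to_rank(coords: Tuple[int, int, int], dims: Dims) -> int:
    """Inverse of rank_to_coords; coordinates wrap around in every dimension."""
    x, y, z = coords
    nx, ny, nz = dims
    return (x % nx) * ny * nz + (y % ny) * nz + (z % nz)


def torus_neighbors(rank: int, dims: Dims) -> List[int]:
    """Return the ranks one hop away along -x, +x, -y, +y, -z, +z (in that order)."""
    x, y, z = rank_to_coords(rank, dims)
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            c = [x, y, z]
            c[axis] += step  # wrap-around is applied inside coords_to_rank
            neighbors.append(coords_to_rank((c[0], c[1], c[2]), dims))
    return neighbors


if __name__ == "__main__":
    # Neighbors of node 0 in a 4x8x8 torus under the assumed numbering.
    print(torus_neighbors(0, (4, 8, 8)))
```

Under this numbering every node in a torus whose dimensions are all at least 3 has six distinct neighbors, one per direction along each of the three axes, with the links at the edges wrapping around to the opposite side.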