Analyzing the impact of supporting out-of-order communication on in-order performance with iWARP

Authors:
P. Balaji; W. Feng; S. Bhagvat; D. K. Panda; R. Thakur; W. Gropp
Affiliations:
Argonne National Laboratory; Virginia Tech; Dell Inc.; Ohio State University; Argonne National Laboratory; Argonne National Laboratory
Venue:
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Year:
2007

Abstract

Because high-end computing systems increasingly need to tolerate network faults and congestion, support for multiple network communication paths is becoming more important. However, multi-path communication has a drawback: packets may traverse different paths and therefore arrive out of order. While modern networking stacks such as the Internet Wide-Area RDMA Protocol (iWARP) over 10-Gigabit Ethernet (10GE) support multi-path communication, their current implementations do not handle out-of-order packets, primarily because doing so adds overhead to in-order communication. Specifically, in iWARP, supporting out-of-order packets requires every packet to carry additional information, which causes significant overhead even for packets that arrive in order. In this paper, we therefore analyze the trade-offs in designing a feature-complete iWARP stack, i.e., one that supports out-of-order packet arrival and thus multi-path systems, while focusing on the performance of in-order communication. We propose three feature-complete iWARP designs and analyze the pros and cons of each using performance experiments based on several micro-benchmarks as well as an iso-surface visual rendering application. Our analysis reveals that the iWARP design providing the best overall performance depends on the characteristics of the upper layers, and that different designs are optimal depending on the metric of interest.