Remote direct memory access (RDMA) minimizes the CPU and memory bus loads associated with network I/O. The Internet Wide Area RDMA Protocol (iWARP) stack, which is based on the Transmission Control Protocol/Internet Protocol (TCP/IP), now makes RDMA available for Ethernet local area networks and wide area networks. As 10-Gb/s Ethernet is deployed in data centers and Ethernet link speeds continue to increase faster than memory bus bandwidth, the capability of RDMA to eliminate all intrahost data copy operations related to network I/O makes it attractive for accelerating TCP/IP. Whereas RDMA network adapters offload iWARP/TCP functionality to dedicated hardware, we have designed an onloaded iWARP software implementation, called SoftRDMA, which runs on the host CPU and is closely integrated with Linux® TCP kernel sockets. SoftRDMA offers asynchronous nonblocking user-space I/O. It enables iWARP with conventional Ethernet adapters, as well as in mixed iWARP hardware and software environments, facilitating RDMA system integration. Furthermore, SoftRDMA benefits client-server applications with asymmetric loads and high aggregate throughput on the server. With iWARP checksumming turned off, SoftRDMA delivers a throughput exceeding 5 Gb/s using a single CPU core. For even higher performance, we suggest zero-copy transmission in software and hardware acceleration for the iWARP framing layer.