This paper presents the design and implementation of a protocol offload engine that processes the TCP/IP and remote direct memory access (RDMA) protocols through hardware/software coprocessing. In the offload engine, time-consuming operations such as TCP/IP header generation are implemented in hardware to improve performance, while software performs control operations and RDMA header generation. Experiments and analyses show that the hardware can process all of its operations at speeds above 1 Gbps. The engine offloads most of the protocol processing overhead, 95% to 100%, from the host CPU. Finally, although the embedded processors run at 300 MHz, a clock seven times slower than that of the host CPU, the engine achieves maximum bandwidths of 673 Mbps for TCP/IP and 551 Mbps for RDMA on a Gigabit Ethernet network.
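To make concrete what "TCP/IP header generation" involves, the following is a minimal software sketch of building an IPv4 header and its checksum. It is not the paper's hardware design: the function names `internet_checksum` and `build_ipv4_header`, the default field values, and the choice of Python are all assumptions made for illustration. It only shows the kind of per-packet, fixed-format work that an offload engine would take over from the host CPU.

```python
import socket
import struct

IPV4_HEADER_FMT = "!BBHHHBBH4s4s"  # 20-byte IPv4 header without options


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int,
                      ident: int = 0, ttl: int = 64) -> bytes:
    """Build a 20-byte IPv4 header carrying TCP (hypothetical helper)."""
    fields = [
        (4 << 4) | 5,            # version 4, header length 5 x 32-bit words
        0,                       # DSCP/ECN
        20 + payload_len,        # total length: header plus payload
        ident,                   # identification
        0,                       # flags and fragment offset
        ttl,                     # time to live
        socket.IPPROTO_TCP,      # protocol number 6
        0,                       # checksum placeholder
        socket.inet_aton(src),   # source address
        socket.inet_aton(dst),   # destination address
    ]
    header = struct.pack(IPV4_HEADER_FMT, *fields)
    fields[7] = internet_checksum(header)  # fill in the real checksum
    return struct.pack(IPV4_HEADER_FMT, *fields)


if __name__ == "__main__":
    hdr = build_ipv4_header("192.168.0.1", "192.168.0.199", payload_len=1460)
    print(hdr.hex())
```

A receiver can validate such a header by running `internet_checksum` over all 20 bytes, which yields 0 when the header is intact. Because every field sits at a fixed offset and the checksum is a simple folded sum, this kind of work maps naturally onto dedicated logic, which is consistent with the abstract's choice to place header generation in hardware.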