Integrated network interfaces for high-bandwidth TCP/IP
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
The Journal of Supercomputing
High performance and scalable I/O virtualization via self-virtualized devices
Proceedings of the 16th international symposium on High performance distributed computing
Towards 100 gbit/s ethernet: multicore-based parallel communication protocol design
Proceedings of the 23rd international conference on Supercomputing
Predictive-flow-queue-based energy optimization for gigabit ethernet controllers
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
This paper explores the hardware and software mechanisms necessary for an efficient programmable 10 Gigabit Ethernet network interface card. Network interface processing requires support for the following characteristics: a large volume of frame data, frequently accessed frame metadata, and high frame rate processing. This paper proposes three mechanisms to improve programmable network interface efficiency. First, a partitioned memory organization enables low-latency access to control data and high-bandwidth access to frame contents from a high-capacity memory. Second, a novel distributed task-queue mechanism enables parallelization of frame processing across many low-frequency cores, while using software to maintain total frame ordering. Finally, the addition of two new atomic read-modify-write instructions reduces frame ordering overheads by 50%. Combining these hardware and software mechanisms enables a network interface card to saturate cores and 4 banks of on-chip SRAM operating at a full-duplex 10 Gb/s Ethernet link by utilizing 6 processor 166 MHz, along with external 500 MHz GDDR SDRAM.