Achieving 10Gbps network processing: are we there yet?

Authors:
Priya Govindarajan;Srihari Makineni;Donald Newell;Ravi Iyer;Ram Huggahalli;Amit Kumar
Affiliations:
Intel Corporation, Hillsboro;Intel Corporation, Hillsboro;Intel Corporation, Hillsboro;Intel Corporation, Hillsboro;Intel Corporation, Hillsboro;Intel Corporation, Hillsboro
Venue:
HiPC'08 Proceedings of the 15th international conference on High performance computing
Year:
2008

Abstract

Scaling TCP/IP receive-side processing to 10Gbps on commercial server platforms has been a major challenge. This led to the development of two key techniques: Large Receive Offload (LRO) and Direct Cache Access (DCA). Systems supporting both techniques have only recently become available, so we evaluate them with 10 Gigabit NICs to find out whether 10Gbps rates are finally achievable. We examine the two techniques in detail to understand the performance benefit they offer and the major overheads that remain. Our measurements show that LRO and DCA together improve TCP/IP receive performance by more than 50% over the base case (neither LRO nor DCA). Combined with improvements in CPU architecture and the rest of the platform over the last 3-4 years, these two techniques have more than doubled TCP/IP receive processing throughput, to 7Gbps. Our detailed architectural characterization of TCP/IP processing with both features enabled reveals that buffer management and copy operations still take up a significant amount of processing time. We also analyze the scaling behavior of TCP/IP to determine how multi-core architectures improve network processing. This part of our analysis highlights some limiting factors that need to be addressed to achieve scaling beyond 10Gbps.