The X-Kernel: An Architecture for Implementing Network Protocols
IEEE Transactions on Software Engineering
Locking effects in multiprocessor implementations of protocols
SIGCOMM '93 Conference proceedings on Communications architectures, protocols and applications
ACM SIGOPS Operating Systems Review
Queue pair IP: a hybrid architecture for system area networks
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
EMP: zero-copy OS-bypass NIC-driven gigabit ethernet message passing
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
Increasing web server throughput with network interface data caching
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Can User-Level Protocols Take Advantage of Multi-CPU NICs?
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Trapeze/IP: TCP/IP at near-gigabit speeds
ATEC '99 Proceedings of the annual conference on USENIX Annual Technical Conference
Spinach: a liberty-based simulator for programmable network interface architectures
Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Isolating the performance impacts of network interface cards through microbenchmarks
Proceedings of the joint international conference on Measurement and modeling of computer systems
Network Interface Data Caching
IEEE Transactions on Computers
RiceNIC: a reconfigurable network interface for experimental research and education
Proceedings of the 2007 workshop on Experimental computer science
Proceedings of the first international conference on Networks for grid applications
Network interfaces for programmable NICs and multicore platforms
Computer Networks: The International Journal of Computer and Telecommunications Networking
Leveraging bandwidth improvements to web servers through enhanced network interfaces
The Journal of Supercomputing
Hi-index | 0.00 |
Programmable network interfaces provide the potential to extend the functionality of network services but lead to instruction processing overheads when compared to application-specific network interfaces. This paper aims to offset those performance disadvantages by exploiting task-level concurrency in the workload to parallelize the network interface firmware for a programmable controller with two processors. By carefully partitioning the handler procedures that process various events related to the progress of a packet, the system can minimize sharing, achieve load balance, and efficiently utilize on-chip storage. Compared to the uniprocessor firmware released by the manufacturer, the parallelized network interface firmware increases throughput by 65% for bidirectional UDP traffic of maximum-sized packets, 157% for bidirectional UDP traffic of minimum-sized packets, and 32--107% for real network services. This parallelization results in performance within 10--20% of a modern ASIC-based network interface for real network services.