Architecture of a message-driven processor
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
T: a multithreaded massively parallel architecture
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
A tightly-coupled processor-network interface
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Proceedings of the 28th annual international symposium on Microarchitecture
Generating representative Web workloads for network and server performance evaluation
SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
On the design of display processors
Communications of the ACM
Protected, user-level DMA for the SHRIMP network interface
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
On the elusive benefits of protocol offload
NICELI '03 Proceedings of the ACM SIGCOMM workshop on Network-I/O convergence: experience, lessons, implications
An Efficient Programmable 10 Gigabit Ethernet Network Interface Card
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Optimizing 10-Gigabit Ethernet for Networks of Workstations, Clusters, and Grids: A Case Study
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Direct Cache Access for High Bandwidth Network I/O
Proceedings of the 32nd annual international symposium on Computer Architecture
Performance Analysis of System Overheads in TCP/IP Workloads
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
ISPASS '03 Proceedings of the 2003 IEEE International Symposium on Performance Analysis of Systems and Software
The M5 Simulator: Modeling Networked Systems
IEEE Micro
Server network scalability and TCP offload
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
TCP offload is a dumb idea whose time has come
HOTOS'03 Proceedings of the 9th conference on Hot Topics in Operating Systems - Volume 9
Trapeze/IP: TCP/IP at near-gigabit speeds
ATEC '99 Proceedings of the annual conference on USENIX Annual Technical Conference
Blue Gene/L compute chip: memory and Ethernet subsystem
IBM Journal of Research and Development
Tapping into the fountain of CPUs: on operating system support for programmable devices
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
High-performance ethernet-based communications for future multi-core processors
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Achieving 10Gbps network processing: are we there yet?
HiPC'08 Proceedings of the 15th international conference on High performance computing
A new TCB cache to efficiently manage TCP sessions for web servers
Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
EINIC: an architecture for high bandwidth network I/O on multi-core processors
Proceedings of the 5th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
Cost-effectively offering private buffers in SoCs and CMPs
Proceedings of the international conference on Supercomputing
Buffer-integrated-Cache: a cost-effective SRAM architecture for handheld and embedded platforms
Proceedings of the 48th Design Automation Conference
Analyzing performance and power efficiency of network processing over 10 GbE
Journal of Parallel and Distributed Computing
RDMA in the SiCortex cluster systems
PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Thin servers with smart pipes: designing SoC accelerators for memcached
Proceedings of the 40th Annual International Symposium on Computer Architecture
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
This paper proposes new network interface controller (NIC) designs that take advantage of integration with the host CPU to provide increased flexibility for operating system kernel-based performance optimization.We believe that this approach is more likely to meet the needs of current and future high-bandwidth TCP/IP networking on end hosts than the current trend of putting more complexity in the NIC, while avoiding the need to modify applications and protocols. This paper presents two such NICs. The first, the simple integrated NIC (SINIC), is a minimally complex design that moves the responsibility for managing the network FIFOs from the NIC to the kernel. Despite this closer interaction between the kernel and the NIC, SINIC provides performance equivalent to a conventional DMA-based NIC without increasing CPU overhead. The second design, V-SINIC, adds virtual per-packet registers to SINIC, enabling parallel packet processing while maintaining a FIFO model. V-SINIC allows the kernel to decouple examining a packet's header from copying its payload to memory. We exploit this capability to implement a true zero-copy receive optimization in the Linux 2.6 kernel, providing bandwidth improvements of over 50% on unmodified sockets-based receive-intensive benchmarks.