Data copying and checksumming are the most expensive operations when doing high-bandwidth network I/O over a high-speed network. Under some conditions, outboard buffering and checksumming can eliminate host accesses to the data, making communication cheaper and faster. One scenario in which outboard buffering pays off is the common case of applications that access the network through the Berkeley sockets interface and the Internet protocol stack. In this paper we describe the changes made to a BSD protocol stack to make use of a network adaptor that supports outboard buffering and checksumming. Our goal is not only to achieve "single-copy" communication for applications that use sockets, but also to provide efficient communication for in-kernel applications and for applications using other networks. Performance measurements show that for large reads and writes the single-copy path through the stack is significantly more efficient than the original implementation.