For a given TCP or UDP flow, protocol processing of incoming packets is performed on the core that receives the interrupt, while the user-space application that consumes the data may run on the same core or a different one. If the cores differ, additional costs may arise from context switches, cache misses, and the movement of data between the cores' caches. The magnitude of this cost depends on the processor affinity of the user-space process relative to the network stack. In this paper we present a prototype implementation of a tool that enables application processing and protocol processing to occur on cores that share the lowest cache level. The Cache-Aware Affinity Daemon (CAAD) analyzes the topology of the die and the NIC characteristics and conveys information to the sender that allows the entire end-to-end path of each new flow to be managed and controlled. This is done in a lightweight manner for both unidirectional and bidirectional flows. Measurements show that for bulk data transfers on commodity multicore machines, CAAD improves overall TCP throughput by as much as 31% and reduces the cache miss rate by as much as 37.5%. GridFTP combined with CAAD reduces the download time for large file transfers by up to 18%.
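The core-selection idea described above can be illustrated with a minimal sketch. This is not CAAD's implementation: the function names and the `shared_lists` mapping are hypothetical, and the sketch assumes that "lowest cache level" means the cache closest to the core that is shared with at least one other core. The cpulist strings follow the format that Linux exposes in files such as `/sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_list`.

```python
import os


def parse_cpu_list(text):
    """Parse a Linux cpulist string such as '0-3,8' into a set of core ids."""
    cores = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cores.update(range(int(first), int(last) + 1))
        else:
            cores.add(int(part))
    return cores


def cores_sharing_cache(irq_core, shared_lists):
    """Return the cores that share the closest shared cache with irq_core.

    shared_lists is a hypothetical mapping {cache_level: cpulist_string}
    describing, for irq_core only, which cores share each cache level.
    Assumption: levels are tried from the closest cache outward, and the
    first level shared by more than one core is used.
    """
    for level in sorted(shared_lists):
        peers = parse_cpu_list(shared_lists[level])
        if irq_core in peers and len(peers) > 1:
            return peers
    # No shared cache found: fall back to the interrupt core itself.
    return {irq_core}


def pin_process(pid, cores):
    """Pin a process to the given cores (Linux only; returns False elsewhere)."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(pid, cores)
    return True
```

For example, if the NIC interrupt for a flow is delivered to core 2, and core 2 has a private L1 cache, an L2 cache shared with core 3, and an L3 cache shared by cores 0 to 7, then `cores_sharing_cache(2, {1: "2", 2: "2-3", 3: "0-7"})` selects cores 2 and 3 as candidates for the consuming application, which could then be pinned with `pin_process`.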