The performance of large-scale, data-intensive applications running on thousands of machines depends heavily on the performance of the network. To deliver better application performance on rapidly evolving high-bandwidth, low-latency interconnects, researchers have proposed the use of network accelerator devices. Despite the initial enthusiasm, however, translating the capabilities of network accelerators into high application performance remains challenging. In this paper, we describe our experience with network acceleration using Remote Direct Memory Access (RDMA)-capable network controllers (RNICs) and discuss the issues we uncovered. RNICs offload packet processing entirely onto the network controller and give applications direct userspace access to the networking hardware. Our analysis shows that multiple, sometimes unrelated, factors significantly influence the performance gains seen by the end application. We identify factors that span the whole stack, ranging from low-level architectural issues (cache and DMA interaction, hardware prefetching) to high-level application parameters (buffer size, access pattern). We discuss the implications of our findings for application performance and for the future integration of network acceleration technology into systems.
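To make the application-level parameters concrete, the sketch below shows one way such factors could be swept in a microbenchmark: it times touching one byte per cache line of a buffer, varying buffer size and access pattern (sequential vs. random). This is a hypothetical illustration of the methodology, not code from the paper; all names (`touch`, `sweep`) and the 64-byte cache-line assumption are ours.

```python
import random
import time

def touch(buf, order):
    """Read one byte per cache line of buf, in the given index order."""
    s = 0
    for i in order:
        s += buf[i]
    return s

def sweep(buf_sizes, line=64):
    """Time sequential vs. random access for each buffer size.

    Returns a dict mapping (size, pattern) -> elapsed seconds.
    """
    results = {}
    for size in buf_sizes:
        buf = bytearray(size)
        seq = list(range(0, size, line))   # one index per cache line
        rnd = seq[:]
        random.shuffle(rnd)                # same lines, randomized order
        for name, order in (("sequential", seq), ("random", rnd)):
            t0 = time.perf_counter()
            touch(buf, order)
            results[(size, name)] = time.perf_counter() - t0
    return results

if __name__ == "__main__":
    for (size, pattern), t in sorted(sweep([64 * 1024, 4 * 1024 * 1024]).items()):
        print(f"{size // 1024:6d} KiB  {pattern:10s} {t * 1e6:10.1f} us")
```

On real hardware the gap between the two patterns typically widens once the buffer outgrows the last-level cache, which is the kind of cache interaction effect the abstract refers to; an RDMA-specific study would replace `touch` with posted RNIC operations.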