Memcached is a general-purpose, key-value-based distributed memory object caching system. It is widely used in data centers to cache the results of database calls, API calls, or page rendering. An efficient Memcached design is critical for achieving high transaction throughput and scalability. Previous research has shown that high-performance interconnects such as InfiniBand can dramatically improve Memcached performance. Reliable Connection (RC) is the most commonly used transport in InfiniBand implementations. However, RC has been shown to scale poorly because of its high memory consumption per connection. This is unfavorable for middleware such as Memcached, where a server must serve thousands of clients. The Unreliable Datagram (UD) transport offers higher scalability, but it has several other limitations that must be handled efficiently. In this context, we introduce a hybrid transport model that combines the best features of RC and UD to deliver better scalability and performance than either transport alone. To the best of our knowledge, this is the first effort to study the impact of a hybrid of multiple transport protocols on Memcached performance. We present a comprehensive performance analysis using microbenchmarks, application benchmarks, and realistic industry workloads. Our evaluations reveal that the hybrid transport delivers performance comparable to that of RC while maintaining a steady memory footprint. Memcached Get latency for a 4-byte data size is 4.28 µs with RC and 4.86 µs with the hybrid transport. This represents a factor-of-twelve improvement over SDP (Sockets Direct Protocol). In evaluations using the Apache Olio benchmark with 1,024 clients, Memcached execution times using the RC, UD, and hybrid transports are 1.61, 1.96, and 1.70 seconds, respectively. Further, our scalability analysis with 4,096 client connections reveals that the proposed hybrid transport achieves good memory scalability.
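To put "comparable to RC" in concrete terms, the figures quoted above can be turned into relative overheads. The following is only a small arithmetic sketch: the dictionary names and the helper `overhead_vs_rc` are illustrative and not part of the paper, and the inputs are exactly the numbers stated in the abstract.

```python
# Figures quoted in the abstract; names below are illustrative assumptions.
get_latency_us = {"RC": 4.28, "hybrid": 4.86}            # 4-byte Get latency (µs)
olio_time_s = {"RC": 1.61, "UD": 1.96, "hybrid": 1.70}   # Apache Olio, 1,024 clients (s)


def overhead_vs_rc(values):
    """Return each non-RC transport's percentage slowdown relative to RC."""
    base = values["RC"]
    return {
        name: round((value / base - 1) * 100, 1)
        for name, value in values.items()
        if name != "RC"
    }


latency_overhead = overhead_vs_rc(get_latency_us)
olio_overhead = overhead_vs_rc(olio_time_s)

print("Get latency overhead vs RC (%):", latency_overhead)
print("Olio execution-time overhead vs RC (%):", olio_overhead)
```

On these numbers, the hybrid transport is about 13.6% slower than RC for the 4-byte Get and about 5.6% slower on Olio, whereas UD is about 21.7% slower than RC on Olio.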