FaRM: fast remote memory

Authors:
Aleksandar Dragojević;Dushyanth Narayanan;Orion Hodson;Miguel Castro
Affiliations:
Microsoft Research;Microsoft Research;Microsoft Research;Microsoft Research
Venue:
NSDI'14 Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation
Year:
2014

Abstract

We describe the design and implementation of FaRM, a new main memory distributed computing platform that exploits RDMA to improve both latency and throughput by an order of magnitude relative to state-of-the-art main memory systems that use TCP/IP. FaRM exposes the memory of machines in the cluster as a shared address space. Applications can use transactions to allocate, read, write, and free objects in the address space with location transparency. We expect this simple programming model to be sufficient for most application code. FaRM provides two mechanisms to improve performance where required: lock-free reads over RDMA, and support for collocating objects and function shipping to enable the use of efficient single-machine transactions. FaRM uses RDMA both to directly access data in the shared address space and for fast messaging, and it is carefully tuned for the best RDMA performance. We used FaRM to build a key-value store and a graph store similar to Facebook's. Both perform well: for example, a 20-machine cluster can perform 167 million key-value lookups per second with a latency of 31µs.
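
The programming model in the abstract (transactions that allocate, read, write, and free objects in a shared address space, with location transparency) can be illustrated with a toy, single-process sketch. This is not FaRM's API or protocol: the names `ToyCluster`, `ToyTx`, and `ConflictError` are hypothetical, the "machines" are plain dictionaries rather than RDMA-accessible memory, and optimistic version checking at commit is an assumption chosen for illustration, since the abstract does not specify how transactions commit.

```python
# Toy model of a transactional shared address space.
# Hypothetical names throughout; this is NOT FaRM's API or commit protocol.

_FREED = object()  # sentinel marking an object freed inside a transaction


class ConflictError(Exception):
    """Raised when an object read by a transaction changed before commit."""


class ToyCluster:
    def __init__(self, num_machines):
        # Each "machine" is a dict: address -> (version, value).
        self.machines = [dict() for _ in range(num_machines)]
        self.next_addr = 0

    def _home(self, addr):
        # Location transparency: callers only see addresses; the cluster
        # decides which machine holds each one (assumed: simple modulo).
        return self.machines[addr % len(self.machines)]

    def begin(self):
        return ToyTx(self)


class ToyTx:
    def __init__(self, cluster):
        self.cluster = cluster
        self.read_versions = {}  # addr -> version observed by read()
        self.writes = {}         # addr -> buffered new value or _FREED

    def alloc(self, value):
        # The address is reserved immediately; the value appears at commit.
        addr = self.cluster.next_addr
        self.cluster.next_addr += 1
        self.writes[addr] = value
        return addr

    def read(self, addr):
        if addr in self.writes:  # read-your-own-writes
            value = self.writes[addr]
            if value is _FREED:
                raise KeyError(addr)
            return value
        version, value = self.cluster._home(addr)[addr]
        self.read_versions[addr] = version
        return value

    def write(self, addr, value):
        self.writes[addr] = value

    def free(self, addr):
        self.writes[addr] = _FREED

    def commit(self):
        # Validate: every object we read must still be at the same version.
        # (Blind writes are not validated in this toy model.)
        for addr, seen in self.read_versions.items():
            current = self.cluster._home(addr).get(addr)
            if current is None or current[0] != seen:
                raise ConflictError(addr)
        # Apply buffered writes, bumping versions.
        for addr, value in self.writes.items():
            home = self.cluster._home(addr)
            if value is _FREED:
                home.pop(addr, None)
            else:
                version = home[addr][0] + 1 if addr in home else 0
                home[addr] = (version, value)


# Usage: two transactions race to increment the same object.
cluster = ToyCluster(num_machines=3)
setup = cluster.begin()
counter = setup.alloc(10)
setup.commit()

t1, t2 = cluster.begin(), cluster.begin()
t1.write(counter, t1.read(counter) + 1)
t2.write(counter, t2.read(counter) + 1)
t1.commit()          # succeeds
try:
    t2.commit()      # its read is now stale
    t2_conflicted = False
except ConflictError:
    t2_conflicted = True

final_value = cluster.begin().read(counter)
```

In this sketch the second transaction aborts because the object it read was modified before it committed, so the caller would retry it. The application code never names a machine, which is the sense in which access is location transparent here.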