Making cloud services responsive is critical to providing a compelling user experience. Many large-scale sites, including LinkedIn, Digg and Facebook, address this need by deploying pools of servers that operate purely on in-memory state. Unfortunately, current technologies for partitioning requests across these in-memory server pools, such as network load balancers, lead to a frustrating programming model where requests for the same state may arrive at different servers. Leases are a well-known technique that can provide a better programming model by assigning each piece of state to a single server. However, in-memory server pools host an extremely large number of items, and granting a lease per item requires fine-grained leasing that is not supported in prior datacenter lease managers. This paper presents Centrifuge, a datacenter lease manager that solves this problem by integrating partitioning and lease management. Centrifuge consists of a set of libraries linked in by the in-memory servers and a replicated state machine that assigns responsibility for data items (including leases) to these servers. Centrifuge has been implemented and deployed in production as part of Microsoft's Live Mesh, a large-scale commercial cloud service in continuous operation since April 2008. When cloud services within Mesh were built using Centrifuge, they required fewer lines of code and did not need to introduce their own subtle protocols for distributed consistency. As cloud services become ever more complicated, this kind of reduction in complexity is an increasingly urgent need.
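To make the idea of combining partitioning with leasing more concrete, the following is a minimal single-process sketch, not Centrifuge's actual API or protocol. It assumes that item keys are hashed into a fixed key space, that the key space is split into one contiguous range per server, and that a lease covers a whole range rather than a single item. All names (`RangeLeaseManager`, `may_serve`, `KEY_SPACE`, the 60-second lease length) are hypothetical, and the replicated state machine described above is replaced here by an ordinary in-memory object.

```python
import bisect
import hashlib
from dataclasses import dataclass

KEY_SPACE = 2 ** 32  # hypothetical size of the hashed key space


def key_hash(key: str) -> int:
    """Map an item key to a point in the hashed key space."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


@dataclass
class Lease:
    start: int         # inclusive start of the leased key range
    end: int           # exclusive end of the leased key range
    owner: str         # server currently responsible for the range
    expires_at: float  # time at which the owner must stop serving


class RangeLeaseManager:
    """Toy stand-in for a lease manager (assumption: not replicated)."""

    def __init__(self, servers, lease_seconds=60.0, now=0.0):
        self.lease_seconds = lease_seconds
        step = KEY_SPACE // len(servers)
        self.leases = []
        for i, server in enumerate(servers):
            # The last range absorbs any remainder of the key space.
            end = KEY_SPACE if i == len(servers) - 1 else (i + 1) * step
            self.leases.append(Lease(i * step, end, server, now + lease_seconds))
        self.starts = [lease.start for lease in self.leases]

    def lookup(self, key: str) -> Lease:
        """Return the one lease whose range covers this key."""
        idx = bisect.bisect_right(self.starts, key_hash(key)) - 1
        return self.leases[idx]

    def reassign_expired(self, new_owner: str, now: float) -> int:
        """Hand every expired range to new_owner; return how many moved."""
        moved = 0
        for lease in self.leases:
            if now >= lease.expires_at:
                lease.owner = new_owner
                lease.expires_at = now + self.lease_seconds
                moved += 1
        return moved


def may_serve(lease: Lease, server: str, now: float) -> bool:
    """A server may answer from memory only while it holds a live lease."""
    return lease.owner == server and now < lease.expires_at
```

In this sketch the manager tracks one lease per range, so the amount of lease state grows with the number of servers rather than the number of items, while every key still maps to exactly one responsible server at a time. A real deployment would also need lease renewal, clock-skew margins, and a fault-tolerant manager, none of which is modeled here.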