Chain replication for supporting high throughput and availability

Authors:
Robbert van Renesse;Fred B. Schneider
Affiliations:
FAST Search & Transfer ASA, Tromsø, Norway and Department of Computer Science, Cornell University, Ithaca, New York;FAST Search & Transfer ASA, Tromsø, Norway and Department of Computer Science, Cornell University, Ithaca, New York
Venue:
OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
Year:
2004

Citing 20
Cited 59

Implementing fault-tolerant services using the state machine approach: a tutorial

ACM Computing Surveys (CSUR)
Disconnected operation in the Coda File System

ACM Transactions on Computer Systems (TOCS)
File-system development with stackable layers

ACM Transactions on Computer Systems (TOCS) - Special issue on operating systems principles
The dangers of replication and a solution

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
An adaptive data replication algorithm

ACM Transactions on Database Systems (TODS)
Flexible update propagation for weakly consistent replication

Proceedings of the sixteenth ACM symposium on Operating systems principles
The part-time parliament

ACM Transactions on Computer Systems (TOCS)
Byzantine generals in action: implementing fail-stop processors

ACM Transactions on Computer Systems (TOCS)
End-to-end arguments in system design

ACM Transactions on Computer Systems (TOCS)
The costs and limits of availability for replicated services

SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility

SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Wide-area cooperative storage with CFS

SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Mariposa: A New Architecture for Distributed Data

Proceedings of the Tenth International Conference on Data Engineering
Competitive Hill-Climbing Strategies for Replica Placement in a Distributed File System

DISC '01 Proceedings of the 15th International Conference on Distributed Computing
Dynamic Replica Placement for Scalable Content Delivery

IPTPS '01 Revised Papers from the First International Workshop on Peer-to-Peer Systems
Are quorums an alternative for data replication?

ACM Transactions on Database Systems (TODS)
The Google file system

SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Farsite: federated, available, and reliable storage for an incompletely trusted environment

OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementation
Replica management should be a game

EW 10 Proceedings of the 10th workshop on ACM SIGOPS European workshop
Distributed versioning: consistent replication for scaling back-end databases of dynamic content web sites

Proceedings of the ACM/IFIP/USENIX 2003 International Conference on Middleware

Fault-scalable Byzantine fault-tolerant services

Proceedings of the twentieth ACM symposium on Operating systems principles
The SMART way to migrate replicated stateful services

Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Tashkent: uniting durability with transaction ordering for high-performance scalable database replication

Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Availability of multi-object operations

NSDI'06 Proceedings of the 3rd conference on Networked Systems Design & Implementation - Volume 3
Antiquity: exploiting a secure log for wide-area distributed storage

Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Tashkent+: memory-aware load balancing and update filtering in replicated databases

Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Optimal inter-object correlation when replicating for availability

Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing
Niobe: A practical replication protocol

ACM Transactions on Storage (TOS)
Replication degree customization for high availability

Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008
RADOS: a scalable, reliable storage service for petabyte-scale storage clusters

PDSW '07 Proceedings of the 2nd international workshop on Petascale data storage: held in conjunction with Supercomputing '07
FaTLease: scalable fault-tolerant lease negotiation with paxos

HPDC '08 Proceedings of the 17th international symposium on High performance distributed computing
Kinesis: A new approach to replica placement in distributed storage systems

ACM Transactions on Storage (TOS)
Configuration-space performance anomaly depiction

LADIS '08 Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware
FTRepMI: Fault-Tolerant, Sequentially-Consistent Object Replication for Grid Applications

ICDCN '09 Proceedings of the 10th International Conference on Distributed Computing and Networking
Combining Techniques to Reduce State Space and Prove Strong Properties

Electronic Notes in Theoretical Computer Science (ENTCS)
PADS: a policy architecture for distributed storage systems

NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
Dynamic atomic storage without consensus

Proceedings of the 28th ACM symposium on Principles of distributed computing
FaTLease: scalable fault-tolerant lease negotiation with Paxos

Cluster Computing
A Unified Framework for Load Distribution and Fault-Tolerance of Application Servers

Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
Building reliable large-scale distributed systems: when theory meets practice

ACM SIGACT News
FAWN: a fast array of wimpy nodes

Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
The next 700 BFT protocols

Proceedings of the 5th European conference on Computer systems
Throughput optimal total order broadcast for cluster environments

ACM Transactions on Computer Systems (TOCS)
Object storage on CRAQ: high-throughput chain replication for read-mostly workloads

USENIX'09 Proceedings of the 2009 conference on USENIX Annual technical conference
Chain replication in theory and in practice

Proceedings of the 9th ACM SIGPLAN workshop on Erlang
Compound treatment of chained declustered replicas using a parallel btree for high scalability and availability

DEXA'10 Proceedings of the 21st international conference on Database and expert systems applications: Part II
Dynamic atomic storage without consensus

Journal of the ACM (JACM)
FAWN: a fast array of wimpy nodes

Communications of the ACM
Increasing performance in byzantine fault-tolerant systems with on-demand replica consistency

Proceedings of the sixth conference on Computer systems
Sierra: practical power-proportionality for data center storage

Proceedings of the sixth conference on Computer systems
Distributed and fault-tolerant execution framework for transaction processing

Proceedings of the 4th Annual International Conference on Systems and Storage
Windows Azure Storage: a highly available cloud storage service with strong consistency

SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Don't settle for eventual: scalable causal consistency for wide-area storage with COPS

SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Scalable real time data management for smart grid

Proceedings of the Middleware 2011 Industry Track Workshop
Adaptive and dynamic funnel replication in clouds

ACM SIGOPS Operating Systems Review
From paxos to CORFU: a flash-speed shared log

ACM SIGOPS Operating Systems Review
Ω meets paxos: leader election and stability without eventual timely links

DISC'05 Proceedings of the 19th international conference on Distributed Computing
Replication techniques for availability

Replication
Walnut: a unified cloud object store

SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
CORFU: a shared log design for flash clusters

NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
HyperDex: a distributed, searchable key-value store

Proceedings of the ACM SIGCOMM 2012 conference on Applications, technologies, architectures, and protocols for computer communication
Towards fair sharing of block storage in a multi-tenant cloud

HotCloud'12 Proceedings of the 4th USENIX conference on Hot Topics in Cloud Computing
Gnothi: separating data and metadata for efficient and available storage replication

USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
Dynamic reconfiguration of primary/backup clusters

USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
HyperDex: a distributed, searchable key-value store

ACM SIGCOMM Computer Communication Review - Special october issue SIGCOMM '12
Flex-KV: enabling high-performance and flexible KV systems

Proceedings of the 2012 workshop on Management of big data systems
Flat datacenter storage

OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Key-attributes based optimistic data consistency maintenance method

ISPA'07 Proceedings of the 5th international conference on Parallel and Distributed Processing and Applications
ChainReaction: a causal+ consistent datastore based on chain replication

Proceedings of the 8th ACM European Conference on Computer Systems
Stronger semantics for low-latency geo-replicated storage

NSDI'13 Proceedings of the 10th USENIX conference on Networked Systems Design and Implementation
Beyond block I/O: implementing a distributed shared log in hardware

Proceedings of the 6th International Systems and Storage Conference
Leveraging endpoint flexibility in data-intensive clusters

Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM
An approach for constructing private storage services as a unified fault-tolerant system

Journal of Systems and Software
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles

ACM SIGOPS 24th Symposium on Operating Systems Principles
Consistency-based service level agreements for cloud storage

Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
Tango: distributed data structures over a shared log

Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
Leveraging sharding in the design of scalable replication protocols

Proceedings of the 4th annual Symposium on Cloud Computing
Orbe: scalable causal consistency using dependency matrices and physical clocks

Proceedings of the 4th annual Symposium on Cloud Computing
CORFU: A distributed shared log

ACM Transactions on Computer Systems (TOCS)

Quantified Score

Hi-index	0.02

Abstract

Chain replication is a new approach to coordinating clusters of fail-stop storage servers. The approach is intended for supporting large-scale storage services that exhibit high throughput and availability without sacrificing strong consistency guarantees. Besides outlining the chain replication protocols themselves, simulation experiments explore the performance characteristics of a prototype implementation. Throughput, availability, and several object-placement strategies (including schemes based on distributed hash table routing) are discussed.