When implementing a distributed storage system, using an algorithm with a formal definition and proof is a wise idea. However, translating any algorithm into effective code can be difficult because the implementation must be both correct and fast. This paper is a case study of the implementation of the chain replication protocol in a distributed key-value store called Hibari. In theory, the chain replication algorithm is quite simple and should be straightforward to implement correctly. In practice, however, there were many implementation details that had effects both profound and subtle. The Erlang community, as well as distributed systems implementors in general, can use the lessons learned with Hibari (specifically in areas of performance enhancements and failure detection) to avoid many dangers that lurk at the interface between theory and real-world computing.