Conventional wisdom holds that Paxos is too expensive to use for high-volume, high-throughput, data-intensive applications. Consequently, fault-tolerant storage systems typically rely on special hardware, semantics weaker than sequential consistency, a limited update interface (such as append-only), primary-backup replication schemes that serialize all reads through the primary, clock synchronization for correctness, or some combination thereof. We demonstrate that a Paxos-based replicated state machine implementing a storage service can achieve performance close to the limits of the underlying hardware while tolerating arbitrary machine restarts, some permanent machine or disk failures, and a limited set of Byzantine faults. We also compare the replicated state machine with two versions of primary-backup replication. The replicated state machine can serve as the data store for a file system or storage array. We present a novel algorithm for ensuring read consistency without logging, along with a sketch of a proof of its correctness.