Software errors are a major cause of outages, and they are increasingly exploited in malicious attacks. Byzantine fault tolerance allows replicated systems to mask some software errors, but it is expensive to deploy. This paper describes a replication technique, BASE, which uses abstraction to reduce the cost of Byzantine fault tolerance and to improve its ability to mask software errors. BASE reduces cost because it enables reuse of off-the-shelf service implementations. It improves availability because each replica can be repaired periodically using an abstract view of the state stored by correct replicas, and because each replica can run distinct or non-deterministic service implementations, which reduces the probability of common-mode failures. We built an NFS service where each replica can run a different off-the-shelf file system implementation, and an object-oriented database where the replicas run the same non-deterministic implementation. These examples suggest that our technique can be used in practice: in both cases, the implementation required only a modest amount of new code, and our performance results indicate that the replicated services perform comparably to the implementations that they reuse.
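
The idea of an abstract view of replica state can be illustrated with a small sketch. It is not the BASE library or its API: the class names, the `abstract_state` and `load_abstract_state` methods, and the `repair` helper are all hypothetical, and a simple majority vote stands in for the Byzantine agreement protocol. The sketch assumes two different implementations of the same key-value service whose concrete states differ (one is non-deterministic internally) but which map to the same abstract state, so a corrupted replica can be repaired from the others.

```python
# Illustrative sketch only: all names here are hypothetical, not the BASE API.
import random


class DictStore:
    """One 'off-the-shelf' implementation: keeps entries in a dict."""

    def __init__(self):
        self._data = {}

    def put(self, name, value):
        self._data[name] = value

    def abstract_state(self):
        # Abstraction function: concrete state -> common abstract state.
        return tuple(sorted(self._data.items()))

    def load_abstract_state(self, state):
        # Inverse abstraction function, used when repairing a replica.
        self._data = dict(state)


class LogStore:
    """A different implementation: an append-only log with random internal ids."""

    def __init__(self):
        self._log = []

    def put(self, name, value):
        # The internal id is non-deterministic, so concrete states differ
        # between replicas even when they run the same operations.
        self._log.append((random.getrandbits(32), name, value))

    def abstract_state(self):
        latest = {}
        for _internal_id, name, value in self._log:
            latest[name] = value  # later log entries win
        return tuple(sorted(latest.items()))

    def load_abstract_state(self, state):
        self._log = [(random.getrandbits(32), n, v) for n, v in state]


def repair(faulty, others):
    """Overwrite a replica with the abstract state most other replicas report.

    Assumption: a plain majority vote replaces the real agreement protocol.
    """
    states = [r.abstract_state() for r in others]
    majority = max(set(states), key=states.count)
    faulty.load_abstract_state(majority)


replicas = [DictStore(), LogStore(), DictStore(), LogStore()]
for r in replicas:
    r.put("a", 1)
    r.put("b", 2)
    r.put("a", 3)

# Concrete representations differ, but the abstract states agree.
agree_before = len({r.abstract_state() for r in replicas}) == 1

# Corrupt one replica, then repair it from the abstract state of the others.
replicas[1]._log.append((0, "a", 999))
corrupted = replicas[1].abstract_state()
repair(replicas[1], replicas[:1] + replicas[2:])
```

Comparing and transferring state only through the abstraction function is what lets replicas with different or non-deterministic internals be checked against each other in this sketch.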