Scale and performance in a distributed file system
ACM Transactions on Computer Systems (TOCS)
POPL '87 Proceedings of the 14th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Implementing fault-tolerant services using the state machine approach: a tutorial
ACM Computing Surveys (CSUR)
Replication in the harp file system
SOSP '91 Proceedings of the thirteenth ACM symposium on Operating systems principles
High-Availability Computer Systems
Computer
Lightweight causal and atomic group multicast
ACM Transactions on Computer Systems (TOCS)
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
RAID: high-performance, reliable secondary storage
ACM Computing Surveys (CSUR)
Inside ODBC
Efficient optimistic concurrency control using loosely synchronized clocks
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Hypervisor-based fault tolerance
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
HAC: hybrid adaptive caching for distributed storage systems
Proceedings of the sixteenth ACM symposium on Operating systems principles
Consistent object replication in the eternal system
Theory and Practice of Object Systems - Special issue high availability in CORBA
Practical Byzantine fault tolerance
OSDI '99 Proceedings of the third symposium on Operating systems design and implementation
Reaching Agreement in the Presence of Faults
Journal of the ACM (JACM)
Replicated distributed programs
Proceedings of the tenth ACM symposium on Operating systems principles
NFS illustrated
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
Program Development in Java: Abstraction, Specification, and Object-Oriented Design
Program Development in Java: Abstraction, Specification, and Object-Oriented Design
Practical byzantine fault tolerance and proactive recovery
ACM Transactions on Computer Systems (TOCS)
Transaction Processing: Concepts and Techniques
Transaction Processing: Concepts and Techniques
The Implementation of POSTGRES
IEEE Transactions on Knowledge and Data Engineering
Don't Be Lazy, Be Consistent: Postgres-R, A New Way to Implement Database Replication
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Portable Checkpointing for Heterogeneous Archtitectures
FTCS '97 Proceedings of the 27th International Symposium on Fault-Tolerant Computing (FTCS '97)
MetaKernels and Fault Containment Wrappers
FTCS '99 Proceedings of the Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing
Improving the Scalability of Fault-Tolerant Database Clusters
ICDCS '02 Proceedings of the 22 nd International Conference on Distributed Computing Systems (ICDCS'02)
The Ensemble System
Software Rejuvenation: Analysis, Module and Applications
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
Providing Support for Survivable CORBA Applications with the Immune System
ICDCS '99 Proceedings of the 19th IEEE International Conference on Distributed Computing Systems
The modified object buffer: a storage management technique for object-oriented databases
The modified object buffer: a storage management technique for object-oriented databases
Proactive recovery in a Byzantine-fault-tolerant system
OSDI'00 Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4
Adding group communication and fault-tolerance to CORBA
COOTS'95 Proceedings of the USENIX Conference on Object-Oriented Technologies on USENIX Conference on Object-Oriented Technologies (COOTS)
MIDDLE-R: Consistent database replication at the middleware level
ACM Transactions on Computer Systems (TOCS)
BTS: a Byzantine fault-tolerant tuple space
Proceedings of the 2006 ACM symposium on Applied computing
Worm-IT - A wormhole-based intrusion-tolerant group communication system
Journal of Systems and Software
Survey of research towards robust peer-to-peer networks: search methods
Computer Networks: The International Journal of Computer and Telecommunications Networking
A Parsimonious Approach for Obtaining Resource-Efficient and Trustworthy Execution
IEEE Transactions on Dependable and Secure Computing
Tolerating byzantine faults in transaction processing systems using commit barrier scheduling
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
PeerReview: practical accountability for distributed systems
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Fault Tolerance via Diversity for Off-the-Shelf Products: A Study with SQL Database Servers
IEEE Transactions on Dependable and Secure Computing
DepSpace: a byzantine fault-tolerant coordination service
Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008
Optimizing Threshold Protocols in Adversarial Structures
DISC '08 Proceedings of the 22nd international symposium on Distributed Computing
Design and implementation of a Byzantine fault tolerance framework for Web services
Journal of Systems and Software
Replication predicates for dependent-failure algorithms
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Behavioral distance for intrusion detection
RAID'05 Proceedings of the 8th international conference on Recent Advances in Intrusion Detection
Behavioral distance measurement using hidden markov models
RAID'06 Proceedings of the 9th international conference on Recent Advances in Intrusion Detection
On the practicality of practical Byzantine fault tolerance
Proceedings of the 13th International Middleware Conference
On the efficiency of durable state machine replication
USENIX ATC'13 Proceedings of the 2013 USENIX conference on Annual Technical Conference
Hi-index | 0.00 |
Software errors are a major cause of outages and they are increasingly exploited in malicious attacks. Byzantine fault tolerance allows replicated systems to mask some software errors but it is expensive to deploy. This paper describes a replication technique, BASE, which uses abstraction to reduce the cost of Byzantine fault tolerance and to improve its ability to mask software errors. BASE reduces cost because it enables reuse of off-the-shelf service implementations. It improves availability because each replica can be repaired periodically using an abstract view of the state stored by correct replicas, and because each replica can run distinct or nondeterministic service implementations, which reduces the probability of common mode failures. We built an NFS service where each replica can run a different off-the-shelf file system implementation, and an object-oriented database where the replicas ran the same, nondeterministic implementation. These examples suggest that our technique can be used in practice---in both cases, the implementation required only a modest amount of new code, and our performance results indicate that the replicated services perform comparably to the implementations that they reuse.