State-machine replication is a general approach to improving the availability and reliability of network-based services through replicated execution. If a service is deterministic, all replicas produce the same results, and faults can be tolerated by means of agreement protocols. Unfortunately, real-life services are often not deterministic. One major source of nondeterminism is multithreaded execution with shared data access, in which the run-time system determines the thread execution order and the outcome may depend on which thread accesses the data first. We present Storyboard, an approach that ensures deterministic execution of multithreaded programs. Storyboard uses application-specific knowledge to minimize costly inter-replica coordination and to exploit concurrency to a degree similar to nondeterministic execution. It does so by forecasting a likely execution path, provided as an ordered sequence of the locks that protect critical sections. If the forecast is correct, a request executes in parallel with other running requests without further action. Only if the forecast is incorrect is the alternative execution path resolved by inter-replica coordination.
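The forecast idea can be illustrated with a minimal sketch. This is not the Storyboard implementation: the class `ForecastedRequest`, the `coordinate` callback, and the lock names are hypothetical, no real locking or replication takes place, and the rule that every acquisition after the first deviation needs coordination is a simplifying assumption. The sketch only shows how a request's lock acquisitions might be compared against a forecast given as an ordered sequence of locks.

```python
from typing import Callable, List


class ForecastedRequest:
    """Compares one request's lock acquisitions with its forecast (hypothetical helper)."""

    def __init__(self, forecast: List[str], coordinate: Callable[[str], None]) -> None:
        self._forecast = list(forecast)   # ordered lock names the request is expected to take
        self._coordinate = coordinate     # stand-in for inter-replica coordination
        self._next = 0                    # index of the next expected lock
        self._on_forecast = True          # False once the request has left the forecast path

    def acquire(self, lock_name: str) -> bool:
        """Return True if this acquisition matches the forecast, False otherwise."""
        if (self._on_forecast
                and self._next < len(self._forecast)
                and self._forecast[self._next] == lock_name):
            self._next += 1
            return True
        # Assumption: after the first deviation, every acquisition is coordinated.
        self._on_forecast = False
        self._coordinate(lock_name)
        return False


coordinated: List[str] = []

# Correct forecast: no coordination is requested.
good = ForecastedRequest(["accounts", "log"], coordinated.append)
good_results = [good.acquire("accounts"), good.acquire("log")]
coordinated_after_good = list(coordinated)

# Incorrect forecast: the request takes "audit" where "log" was predicted.
bad = ForecastedRequest(["accounts", "log"], coordinated.append)
bad_results = [bad.acquire("accounts"), bad.acquire("audit"), bad.acquire("log")]
```

In this sketch the request that follows its forecast never invokes `coordinate`, while the deviating request invokes it for each acquisition from the deviation onward. A real system would additionally have to order the forecast acquisitions of concurrent requests consistently across replicas, which the sketch does not attempt.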