The state machine approach is a general method for achieving fault tolerance and implementing decentralized control in distributed systems. This paper reviews the approach and identifies abstractions needed for coordinating ensembles of state machines. Implementations of these abstractions for two different failure models, Byzantine and fail-stop, are discussed. The state machine approach is illustrated by programming several examples. Optimization and system reconfiguration techniques are explained.
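As a rough illustration of the idea summarized above, the following Python sketch runs an ensemble of deterministic replicas over one shared, ordered request list and takes a majority vote on each output. It is a minimal sketch under stated assumptions, not the paper's own code: the names `KVStateMachine`, `FaultyReplica`, `replicas_needed`, and `run_ensemble` are hypothetical, the agreed request order is simply given as a Python list rather than produced by an agreement protocol, and the replica counts (2t+1 for Byzantine failures, t+1 for fail-stop) are the thresholds usually associated with this approach rather than figures stated in the abstract.

```python
from collections import Counter


class KVStateMachine:
    """Deterministic state machine: the same requests in the same order
    always produce the same state and the same outputs."""

    def __init__(self):
        self.state = {}

    def apply(self, request):
        op, key, *rest = request
        if op == "put":
            self.state[key] = rest[0]
            return "ok"
        if op == "get":
            return self.state.get(key)
        raise ValueError(f"unknown operation: {op}")


class FaultyReplica(KVStateMachine):
    """Hypothetical Byzantine replica: returns a corrupted value for reads."""

    def apply(self, request):
        result = super().apply(request)
        return "garbage" if request[0] == "get" else result


def replicas_needed(t, failure_model):
    """Replica count assumed here to tolerate t faulty replicas."""
    if failure_model == "byzantine":
        return 2 * t + 1  # correct replicas must outvote the faulty ones
    if failure_model == "fail-stop":
        return t + 1      # one surviving replica is enough
    raise ValueError(f"unknown failure model: {failure_model}")


def run_ensemble(replicas, ordered_requests):
    """Deliver every request to every replica in the same order and
    return the majority output for each request."""
    voted = []
    for request in ordered_requests:
        outputs = [replica.apply(request) for replica in replicas]
        value, count = Counter(outputs).most_common(1)[0]
        if count <= len(replicas) // 2:
            raise RuntimeError("no majority output")
        voted.append(value)
    return voted


# Usage: tolerate t = 1 Byzantine replica with 2t + 1 = 3 replicas.
t = 1
n = replicas_needed(t, "byzantine")
replicas = [KVStateMachine() for _ in range(n - 1)] + [FaultyReplica()]
log = [("put", "x", 1), ("get", "x")]
print(run_ensemble(replicas, log))  # ['ok', 1]
```

The voting step only masks faults because every correct replica sees the same requests in the same order; in a real deployment that ordering would have to come from an agreement or atomic broadcast mechanism, which this sketch deliberately leaves out.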