Many replica management approaches have been developed in the literature, including the state machine approach and the primary/backup approach. In these approaches, applications are replicated on different processing elements; each instance of the replicated application is called a replica. Replicas cooperate to provide fault tolerance. In this article, we introduce a new replica management approach for implementing embedded fault-tolerant systems based on the functional programming paradigm. The proposed approach, called active parallel replication, takes advantage of the redundancy existing in the system and of the properties of the functional programming paradigm, such as referential transparency, not only to provide fault tolerance but also to improve system performance and to reduce the computation costs of replica management and synchronization.
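As a rough illustration of why referential transparency matters here, the sketch below simulates replicas that share a workload instead of all computing the same thing. It is not the algorithm from the article: the function names (`square`, `run_on_replicas`), the round-robin assignment, and the boolean liveness list are assumptions made only for this example. The point it tries to show is that, because a pure function returns the same value wherever it is evaluated, work assigned to a failed replica can simply be re-evaluated on a live one without any state synchronization.

```python
from typing import Callable, List, Sequence, Tuple


def square(x: int) -> int:
    """A pure function: the result depends only on the argument."""
    return x * x


def run_on_replicas(
    f: Callable[[int], int],
    inputs: Sequence[int],
    alive: Sequence[bool],
) -> List[Tuple[int, int]]:
    """Evaluate f over inputs, spreading the work across replicas.

    Hypothetical scheme: input i is assigned round-robin to replica
    i % n; if that replica is marked as failed, the next live replica
    evaluates it instead. Returns (replica_id, result) pairs.
    """
    if not any(alive):
        raise RuntimeError("no live replica available")
    n = len(alive)
    out: List[Tuple[int, int]] = []
    for i, x in enumerate(inputs):
        # First live replica at or after the round-robin slot.
        replica = next(r % n for r in range(i, i + n) if alive[r % n])
        out.append((replica, f(x)))
    return out


healthy = run_on_replicas(square, [1, 2, 3, 4], [True, True, True])
degraded = run_on_replicas(square, [1, 2, 3, 4], [True, False, True])
```

In this sketch the computed values are identical whether or not replica 1 has failed; only the assignment of work changes. A real system would also need failure detection and communication between processing elements, which are left out here.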