This paper presents a replication technique based on the FTAG (fault-tolerant attribute grammar) computation model, in which instances of a replicated application are active on different groups of processors called replicas. FTAG is a functional and attribute-based model. The technique implements "active parallel replication": all replicas are active, and each concurrently computes a different piece of the application's parallel code. In this model, replicas cooperate not only to detect and mask failures but also to perform parallel computation. The replication mechanisms are supported by the FTAG run-time system and are fully application-transparent. Novel mechanisms for checkpointing and recovery are developed. Rollback is performed only if the system experiences multiple failures; otherwise, forward recovery is performed. The replication technique takes full advantage of parallel computation to reduce computation time.
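The recovery policy stated in the abstract (forward recovery after a single failure, rollback only after multiple failures) can be illustrated with a minimal Python sketch. This is not the FTAG run-time system: the names `choose_recovery` and `run_step`, the dictionary-based checkpoint, and the assumption that forward recovery means recomputing the lost piece on a surviving replica are all hypothetical and serve only to make the decision rule concrete.

```python
def choose_recovery(num_failed: int) -> str:
    """Pick a recovery action from the number of replicas that failed in a step."""
    if num_failed == 0:
        return "none"
    if num_failed == 1:
        return "forward"   # single failure: continue without rolling back
    return "rollback"      # multiple failures: restore the last checkpoint


def run_step(pieces, failed, compute, checkpoint):
    """Run one step in which each replica computes a different piece.

    pieces:     dict mapping replica id -> input piece for this step
    failed:     set of replica ids that failed during this step
    compute:    function applied to each piece
    checkpoint: dict of results saved at the last checkpoint (hypothetical format)
    """
    action = choose_recovery(len(failed))
    if action == "rollback":
        # Assumption: rollback discards this step and returns the saved state.
        return action, dict(checkpoint)

    # Live replicas each compute their own, distinct piece.
    results = {r: compute(p) for r, p in pieces.items() if r not in failed}

    if action == "forward":
        # Assumption: the lost piece is recomputed on behalf of the failed
        # replica, so the step completes without returning to a checkpoint.
        for r in failed:
            results[r] = compute(pieces[r])
    return action, results
```

For example, calling `run_step({"r0": 1, "r1": 2, "r2": 3}, {"r1"}, lambda x: x * x, {})` selects forward recovery and still yields a result for every piece, whereas passing two failed replica ids returns the checkpoint contents unchanged.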