A design approach developed over the past few years to formalize redundancy management and validation is described. Redundant elements are partitioned into individual fault-containment regions (FCRs). An FCR is a collection of components that operates correctly regardless of any arbitrary logical or electrical fault outside the region. Conversely, a fault in an FCR cannot cause hardware outside the region to fail. The outputs of all channels are required to agree bit-for-bit under no-fault conditions (exact bitwise consensus). Synchronization, input agreement, and input validity conditions are discussed. The Advanced Information Processing System (AIPS), which is a fault-tolerant distributed architecture based on this approach, is described. A brief overview of recent applications of these systems and current research is presented.
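The exact bitwise consensus requirement can be illustrated with a small sketch. It is not taken from AIPS: it assumes each redundant channel delivers its output as a `bytes` value, and the function names `exact_match_vote` and `disagreeing_channels` are hypothetical. The majority-vote step is likewise an assumption about how such agreement might be used to mask a faulty channel.

```python
from collections import Counter
from typing import List, Optional, Sequence


def exact_match_vote(outputs: Sequence[bytes]) -> Optional[bytes]:
    """Return the output held by a strict majority of channels, or None.

    Hypothetical helper: outputs are compared bit-for-bit, so two channels
    agree only if their byte strings are identical.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # Require a strict majority; a tie or a three-way split yields no result.
    return value if count > len(outputs) // 2 else None


def disagreeing_channels(outputs: Sequence[bytes]) -> List[int]:
    """Return indices of channels whose output differs from the voted value.

    If no strict majority exists, every channel is reported, because the
    sketch cannot tell which channels are healthy.
    """
    voted = exact_match_vote(outputs)
    if voted is None:
        return list(range(len(outputs)))
    return [i for i, out in enumerate(outputs) if out != voted]


if __name__ == "__main__":
    # Three channels, one of which differs in a single byte.
    channel_outputs = [b"\x01\x02", b"\x01\x02", b"\x01\x03"]
    print(exact_match_vote(channel_outputs))
    print(disagreeing_channels(channel_outputs))
```

Exact comparison keeps the voter simple because no tolerance thresholds are needed, but it only works if fault-free channels produce identical bits, which is why the abstract pairs it with synchronization and input agreement.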