A description is given of the multicomputer architecture for fault tolerance (MAFT), a distributed system designed to provide extremely reliable computation in real-time control systems. MAFT is based on the physical and functional partitioning of executive functions from application functions. Implementing the executive functions in a special-purpose hardware processor makes the fault-tolerance functions transparent to the application programs and minimizes overhead. Byzantine agreement and approximate agreement algorithms are used for critical system parameters. MAFT supports the use of multiversion hardware and software to tolerate built-in or generic faults. The application workload degrades gracefully when nodes are excluded and is restored when they are readmitted.
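
To make the approximate agreement idea concrete, the following is a minimal sketch of one voting round of a generic fault-tolerant midpoint rule. It is not taken from MAFT: the function name `trimmed_midpoint`, the parameter `f` (the assumed maximum number of faulty readings), and the sample values are all hypothetical, and the abstract does not say which voting rule or fault bound MAFT uses.

```python
from typing import Sequence


def trimmed_midpoint(values: Sequence[float], f: int) -> float:
    """One round of a generic fault-tolerant midpoint vote (illustrative only).

    Discards the f lowest and f highest readings, then returns the midpoint
    of the extremes that remain. The check below only guarantees that at
    least one value survives trimming; it does not enforce any stronger
    bound that a particular fault model may require.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    if len(values) <= 2 * f:
        raise ValueError("need more than 2*f values")
    ordered = sorted(values)
    kept = ordered[f:len(ordered) - f]  # drop f values from each end
    return (kept[0] + kept[-1]) / 2.0


# Hypothetical readings from four nodes, one of them wildly wrong.
readings = [10.0, 10.5, 9.5, 1000.0]
agreed = trimmed_midpoint(readings, f=1)
```

With these sample readings, the outlier 1000.0 and the lowest value 9.5 are discarded, so the result lies between the remaining readings 10.0 and 10.5. Each non-faulty node would apply the same rule to the values it received, which keeps the nodes' results close to one another even if a faulty node reports different values to different peers.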