The advent of open architectures and initiatives in massively parallel supercomputing, combined with the maturation of distributed processing methods and algorithms, has enabled the implementation of responsive software-based fault tolerance. Expanding capabilities of distributed Ada runtime environments further stimulate the incorporation of hardware fault tolerance into critical, real-time embedded systems. By integrating proven Ada program component distribution with virtually synchronous communication protocols, we have established a benchmark fault-tolerant system that layers transparently between an Ada application and the runtime environment. This transparency allows rapid reconfiguration of distribution and fault tolerance characteristics without changes to the source code, thus enhancing portability, scalability, and reuse.

The Ada Fault Tolerance project has implemented software technologies that penetrate the envelope of an Ada program to detect, diagnose, and recover from hardware faults. These real-time facilities interact with the Rational distributed application development and runtime environment systems to service replicated Ada software tasks (i.e., threads of control). The deployed system demonstrates that all replicated threads, including those of independently distributed components, can achieve timely consensus during periodic fault detection cycles through transparently embedded voting protocols. Our implementation uses a hybrid redundancy computation strategy and relies on a communication layer that provides virtual synchrony via a causal multicast protocol.
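To make the idea of voting among replicated threads concrete, the following is a minimal sketch of a strict-majority voter applied to the outputs collected in one fault detection cycle. It is written in Python for brevity rather than Ada, and it is not the project's actual protocol: the function name `majority_vote`, its return shape, and the assumption that replica outputs can be compared for exact equality are all hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple


def majority_vote(outputs: Sequence[Hashable]) -> Tuple[Optional[Hashable], List[int]]:
    """Vote over the outputs reported by replicas in one detection cycle.

    Returns (agreed_value, suspects). agreed_value is the value reported by
    a strict majority of replicas, or None if no such value exists.
    suspects lists the indices of replicas that disagreed with the majority;
    when there is no majority, every replica is listed as a suspect.

    Hypothetical helper: assumes outputs are hashable and compared exactly.
    """
    if not outputs:
        return None, []

    # Find the most frequently reported value and how often it occurred.
    value, count = Counter(outputs).most_common(1)[0]

    # Require a strict majority (more than half of all replicas).
    if count * 2 <= len(outputs):
        return None, list(range(len(outputs)))

    # Replicas whose output differs from the majority are fault suspects.
    suspects = [i for i, out in enumerate(outputs) if out != value]
    return value, suspects
```

For example, with three replicas reporting `[42, 42, 7]`, the call `majority_vote([42, 42, 7])` yields the agreed value `42` and flags replica index `2` as a suspect. A real system would also need to handle replicas that fail to report within the cycle deadline, which this sketch leaves out.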