Transparent fault tolerance for distributed Ada applications

Authors:
Mark A. Breland;Steven A. Rogers;Guillaume P. Brat;Kenneth L. Nelson
Affiliations:
Microelectronics and Computer Technology Corporation (MCC), 3500 West Balcones Center Drive, Austin, Texas;Microelectronics and Computer Technology Corporation (MCC), 3500 West Balcones Center Drive, Austin, Texas;Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, Texas;Computing Devices International, 8800 Queen Avenue South, Bloomington, Minnesota
Venue:
TRI-Ada '94 Proceedings of the conference on TRI-Ada '94
Year:
1994

Abstract

The advent of open architectures and initiatives in massively parallel supercomputing, combined with the maturation of distributed processing methods and algorithms, has enabled the implementation of responsive software-based fault tolerance. Expanding capabilities of distributed Ada runtime environments further stimulate the incorporation of hardware fault tolerance into critical, realtime embedded systems. Through the integration of proven Ada program component distribution and virtually synchronous communication protocols, we have established a benchmark fault tolerant system, which layers transparently between an Ada application and the runtime environment. Such transparence allows rapid reconfiguration of distribution and fault tolerance characteristics without change to the source code, thus enhancing portability, scalability, and reuse.

The Ada Fault Tolerance project has implemented software technologies which penetrate the envelope of an Ada program to detect, diagnose, and recover from hardware faults. These realtime facilities interact with the Rational distributed application development and runtime environment systems to service replicated Ada software tasks (i.e., threads of control). The deployed system proves that all replicated threads, including those of independently distributed components, can achieve timely consensus during periodic fault detection cycles through transparently embedded voting protocols. Our implementation uses a hybrid redundancy computation strategy and relies on a communication layer which provides virtual synchrony via a causal multicast protocol.
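
The abstract does not spell out the voting protocol, so the following is only an illustrative sketch of the general idea of replicas reaching agreement during a fault detection cycle. It assumes a simple strict-majority vote over the outputs of replicated tasks. The function names `majority_vote` and `faulty_replicas` are hypothetical and do not come from the paper, and the hybrid redundancy strategy and the causal multicast layer are not modeled here.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas, else None.

    Assumption: each replica contributes exactly one output per detection cycle.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # A strict majority needs more than half of all replicas to agree.
    return value if count > len(outputs) // 2 else None


def faulty_replicas(outputs: Sequence[Hashable]) -> List[int]:
    """Return indices of replicas whose output differs from the majority value."""
    agreed = majority_vote(outputs)
    if agreed is None:
        # Without a strict majority the vote cannot single out a faulty replica.
        return []
    return [i for i, out in enumerate(outputs) if out != agreed]
```

With three replicas reporting `[42, 42, 7]`, this sketch accepts `42` and flags the replica at index 2 as the suspect. A real system would then also have to diagnose the fault and recover from it, which this example leaves out.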