A fault tolerant computer system for simulation of complex systems
ANSS '86 Proceedings of the 19th annual symposium on Simulation
The properties of a fault tolerant computer system based on a hexagonal grid of processing elements (called the FMPA system) are investigated through discrete event simulation. A hypothetical test environment is used to investigate the robustness of the system and to study its sensitivity to processor and bus speeds and to the assignment of tasks to the processors, as well as its performance after component failure. The system, at least as tested, is remarkably robust, and even seems to perform better in the face of moderate component failure. A description of the simulation program is included, as are suggestions for further research.