A distributed system is considered in which a job can be broken into a number of subjobs that are processed sequentially at various processors. The performance of this system is compared with that of its replicated (triple modular redundant, or TMR) version, in which each subjob requires concurrent replicated processing with majority voting. The effect of voting times and processor failure rates on system performance is investigated using analytical approximations and computer simulations, and the accuracy of the approximations is examined. The results indicate the possible existence of a threshold voting time, below which the TMR system performs better than the unreplicated one and above which the situation is reversed. Such thresholds are observed, where possible, in systems with repairable servers as well as in those with nonrepairable servers.
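The idea of a threshold voting time can be illustrated with a deliberately simplified toy model. It is not the model analysed in the paper, and every name and assumption in it is hypothetical: each subjob has an exponential service time with rate mu, each processor returns a faulty result with probability p independently, the unreplicated system detects a fault only at the end of the job and reruns all n subjobs, and the TMR system waits for all three replicas, votes for a fixed time v, and retries the subjob whenever no correct majority exists (assumed to be always detectable).

```python
from typing import Optional


def majority_ok(p: float) -> float:
    """Probability that at least 2 of 3 independent replicas are correct."""
    return (1 - p) ** 3 + 3 * p * (1 - p) ** 2


def unreplicated_time(n: int, mu: float, p: float) -> float:
    """Expected job time when any fault forces a rerun of all n subjobs
    (toy assumption: faults are detected only at the end of the job)."""
    return (n / mu) / (1 - p) ** n


def tmr_time(n: int, mu: float, p: float, v: float) -> float:
    """Expected job time under TMR in the toy model.

    Each attempt waits for all three replicas; the mean of the maximum of
    three exponentials with rate mu is 11 / (6 * mu). Voting then takes v,
    and the subjob is retried if the vote finds no correct majority.
    """
    per_attempt = 11 / (6 * mu) + v
    return n * per_attempt / majority_ok(p)


def threshold_voting_time(n: int, mu: float, p: float) -> Optional[float]:
    """Voting time at which both systems have equal expected job time,
    or None when TMR is slower even with instantaneous voting."""
    v_star = majority_ok(p) / (mu * (1 - p) ** n) - 11 / (6 * mu)
    return v_star if v_star > 0 else None
```

In this sketch, with n = 10, mu = 1 and p = 0.1, the threshold comes out at roughly 0.95 mean service times, whereas with p = 0.01 no positive threshold exists. That is consistent with the abstract's remark that thresholds appear only "where possible", but the numbers themselves say nothing about the paper's actual results.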