k-resilient protocols are used in some parallel and distributed system applications to increase the availability of resources. A protocol running on an n-site system is k-resilient if it can tolerate up to k site failures and still operate correctly. The reliability of such a protocol is defined as the probability that no more than k sites have failed. A k-resilient protocol is beneficial only while its reliability exceeds that of a protocol running on a single-site system. We consider k-resilient protocols and develop a general technique for approximately computing the time up to which they are more reliable than protocols running on single-site systems. We call this time the reliability interval. The technique applies regardless of the failure distribution (with respect to time) of the system's sites. We validate the technique with experimental results.
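The definitions above can be illustrated with a minimal numerical sketch. It is not the approximation technique developed in the paper: it assumes that sites fail independently and share the same reliability function, and it locates the crossover time by plain bisection. The names `protocol_reliability`, `reliability_interval`, `site_reliability`, `t_lo`, and `t_hi` are hypothetical and chosen only for this illustration.

```python
from math import comb, exp, log


def protocol_reliability(n, k, r):
    """Probability that at most k of n sites have failed.

    Assumption (not from the source): sites fail independently and each
    is operational with the same probability r.
    """
    return sum(comb(n, i) * (1 - r) ** i * r ** (n - i) for i in range(k + 1))


def reliability_interval(n, k, site_reliability, t_lo=1e-6, t_hi=100.0, iters=200):
    """Time at which the k-resilient protocol stops beating a single site.

    site_reliability(t) is the probability that one site is still up at
    time t. Bisection assumes exactly one crossover inside (t_lo, t_hi).
    """
    def gap(t):
        r = site_reliability(t)
        return protocol_reliability(n, k, r) - r

    if gap(t_lo) <= 0 or gap(t_hi) >= 0:
        raise ValueError("bracket does not contain a sign change")

    for _ in range(iters):
        mid = 0.5 * (t_lo + t_hi)
        if gap(mid) > 0:
            t_lo = mid  # protocol still more reliable than one site
        else:
            t_hi = mid  # single site is now at least as reliable
    return 0.5 * (t_lo + t_hi)


# Example: 3 sites, 1-resilient, exponential lifetimes with rate 1 (assumed).
interval = reliability_interval(3, 1, lambda t: exp(-t))
```

For n = 3 and k = 1 the protocol reliability is 3r^2 - 2r^3, which exceeds r exactly when 1/2 < r < 1. Under the assumed exponential lifetime with rate 1, the crossover therefore falls at t = ln 2, which the bisection reproduces. Any other site reliability function can be passed in, as long as the bracket contains a single crossover.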