We consider the problem of fault diagnosis in multiprocessor systems. Processors perform tests on one another; fault-free testers correctly identify the fault status of tested processors, while faulty testers can give arbitrary test results. Processors fail with arbitrary probabilities, and all failures are independent. The goal is to correctly identify the status of all processors based on the set of test results. A diagnosis algorithm is optimal if it has the highest probability of correctness (reliability) among all deterministic diagnosis algorithms. We give a fast diagnosis algorithm and prove its optimality for arbitrary values of failure probabilities. This is the first optimal diagnosis given for systems without any assumptions on the behavior of faulty processors or on the values of failure probabilities.

We also investigate locally optimal diagnosis algorithms: for any set of test results, they return the most probable configuration of faulty and fault-free processors that could yield it. We give a fast diagnosis algorithm that is always locally optimal. If all processors have failure probabilities smaller than $\frac{1}{2}$, a locally optimal diagnosis is proved to be optimal. However, if some processors have failure probabilities exceeding $\frac{1}{2}$, a locally optimal diagnosis need not have the highest reliability. We even give examples in which its reliability becomes arbitrarily small as the number of processors increases, while the optimal reliability remains constant.
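To make the notion of a locally optimal diagnosis concrete, the following is a minimal brute-force sketch in Python. It is not the fast algorithm of the paper: it simply enumerates all fault configurations, keeps those consistent with the test results, and returns the most probable one. The function names, the encoding of a test as a `(tester, tested, says_faulty)` triple, and the sample data are assumptions made for illustration only.

```python
from itertools import product
from math import prod


def consistent(faulty, tests):
    """A configuration is consistent if every fault-free tester reported
    the true status of the processor it tested. Faulty testers may
    report anything, so their results impose no constraint."""
    return all(faulty[u] or says_faulty == faulty[v]
               for u, v, says_faulty in tests)


def config_probability(faulty, p):
    """Probability of a configuration under independent failures,
    where p[i] is the failure probability of processor i."""
    return prod(pi if f else 1 - pi for f, pi in zip(faulty, p))


def locally_optimal_diagnosis(p, tests):
    """Exhaustive search (exponential in the number of processors) for
    the most probable configuration that could yield the test results."""
    best, best_prob = None, -1.0
    for faulty in product([False, True], repeat=len(p)):
        if consistent(faulty, tests):
            prob = config_probability(faulty, p)
            if prob > best_prob:
                best, best_prob = faulty, prob
    return best, best_prob


# Hypothetical example: three processors, each failing with probability 0.1.
p = [0.1, 0.1, 0.1]
tests = [
    (0, 1, False),  # processor 0 reports processor 1 as fault-free
    (1, 0, False),  # processor 1 reports processor 0 as fault-free
    (0, 2, True),   # processor 0 reports processor 2 as faulty
    (1, 2, True),   # processor 1 reports processor 2 as faulty
    (2, 0, True),   # processor 2 reports processor 0 as faulty
]
diagnosis, probability = locally_optimal_diagnosis(p, tests)
print(diagnosis, probability)
```

In this example the only consistent configuration with a single fault marks processor 2 as faulty, and since every failure probability is below $\frac{1}{2}$, configurations with more faults are less probable. The exhaustive search is only practical for a handful of processors; it is meant to illustrate the definition, not to replace an efficient procedure.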