Fault Detection for Byzantine Quorum Systems
IEEE Transactions on Parallel and Distributed Systems
In this paper we explore techniques to detect Byzantine server failures in replicated data services. Our goal is to detect arbitrary failures of data servers in a system where each client accesses the replicated data at only a subset (quorum) of servers in each operation. In such a system, some correct servers can be out of date after a write and can therefore return values other than the most up-to-date value in response to a client's read request, which complicates the task of determining the number of faulty servers in the system at any point in time. We initiate the study of detecting server failures in this context and propose two statistical approaches for estimating the number of faulty servers based on responses to read requests.
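
The difficulty described above can be illustrated with a small sketch. It is not one of the paper's two statistical approaches; it only shows why counting mismatched read responses overestimates the number of faulty servers. The function name, the quorum sets, and the simplifying assumption that every faulty server returns a wrong value are all hypothetical choices made for illustration.

```python
def read_mismatches(read_quorum, write_quorum, faulty):
    """Split read responses that differ from the latest value by cause.

    Assumption (illustrative only): a faulty server always returns a wrong
    value, and a correct server returns the latest value only if it was in
    the write quorum of the most recent write.
    """
    # Correct servers that missed the last write and so answer with old data.
    stale = {s for s in read_quorum if s not in write_quorum and s not in faulty}
    # Faulty servers contacted by this read.
    bad = {s for s in read_quorum if s in faulty}
    return stale, bad


# Hypothetical system of 10 servers with overlapping quorums of size 6.
write_quorum = set(range(0, 6))   # servers 0..5 received the latest write
read_quorum = set(range(4, 10))   # servers 4..9 answer the read
faulty = {5, 9}

stale, bad = read_mismatches(read_quorum, write_quorum, faulty)
observed = len(stale) + len(bad)  # what a client can actually count
print(f"mismatched responses: {observed}, truly faulty among them: {len(bad)}")
```

Under these assumptions the client sees more mismatched responses than there are faulty servers in the read quorum, because out-of-date correct servers are indistinguishable from faulty ones in a single read.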