Fault Detection for Byzantine Quorum Systems

  • Authors:
  • Lorenzo Alvisi;Dahlia Malkhi;Evelyn Pierce;Michael K. Reiter

  • Affiliations:
  • Univ. of Texas, Austin;The Hebrew Univ. of Jerusalem, Israel;Laboratoire de Systèmes d'Exploitation, Lausanne, Switzerland;Bell Laboratories, Murray Hill, NJ

  • Venue:
  • IEEE Transactions on Parallel and Distributed Systems
  • Year:
  • 2001

Abstract

In this paper, we explore techniques to detect Byzantine server failures in asynchronous replicated data services. Our goal is to detect arbitrary failures of data servers in a system where each client accesses the replicated data at only a subset (quorum) of servers in each operation. In such a system, some correct servers can be out of date after a write and can therefore return values other than the most up-to-date value in response to a client's read request, which complicates the task of determining the number of faulty servers in the system at any point in time. We initiate the study of detecting server failures in this context and propose two statistical approaches for estimating the risk posed by faulty servers based on responses to read requests.
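
The difficulty described in the abstract can be illustrated with a small sketch. It is not the paper's method or either of its statistical approaches. It only shows, under assumed toy parameters, why a client cannot simply count mismatching read responses to find faulty servers. The server IDs, quorum contents, and the helper `classify_read` are all hypothetical.

```python
def classify_read(read_quorum, write_quorum, faulty):
    """Split a read quorum into three groups of server IDs (all sets).

    Hypothetical helper for illustration only; a real client cannot
    observe this split directly, which is the point of the example.
    """
    # Correct servers that received the latest write.
    up_to_date = (read_quorum & write_quorum) - faulty
    # Correct servers that missed the latest write and hold an old value.
    stale = read_quorum - write_quorum - faulty
    # Faulty servers, which may return arbitrary values.
    bad = read_quorum & faulty
    return up_to_date, stale, bad


# Assumed toy system: 7 servers, quorums of 5, one Byzantine server.
write_quorum = {0, 1, 2, 3, 4}
read_quorum = {2, 3, 4, 5, 6}
faulty = {4}

up_to_date, stale, bad = classify_read(read_quorum, write_quorum, faulty)

# Worst case for the client: every stale or faulty server returns
# something other than the latest value.
mismatches = len(stale) + len(bad)
print(f"mismatching responses: {mismatches}, actually faulty: {len(bad)}")
```

In this assumed configuration the client may see three responses that differ from the latest value although only one server is faulty, so the raw mismatch count overstates the number of faults.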