Evaluating quorum systems over the Internet

  • Authors:
  • Y. Amir; A. Wool

  • Venue:
  • FTCS '96 Proceedings of the Twenty-Sixth Annual International Symposium on Fault-Tolerant Computing (FTCS '96)
  • Year:
  • 1996

Abstract

Quorum systems serve as a basic tool providing a uniform and reliable way to achieve coordination in a distributed system. They are useful for distributed and replicated databases, name servers, mutual exclusion, and distributed access control and signatures. Traditionally, two basic methods have been used to evaluate quorum systems: the analytical approach and simulation. We propose a third, empirical approach. We collected six months of connectivity and operability data from a system of 14 real computers running a wide area group communication protocol. The system spanned two geographic sites and three different Internet segments. We developed a mechanism that merges the local views into a unified history of the events that took place, ordered according to an imaginary global clock. We then developed a tool called the Generic Quorum-system Evaluator (GQE), which evaluates the behavior of any given quorum system over the unified, real-life history. We compared fourteen dynamic and static quorum systems. As predicted, dynamic quorum systems behaved better than static ones. However, we found that many assumptions made by the traditional approaches are unjustified: crashes are strongly correlated, network partitions do occur even within a single Internet segment, and we even detected a brief simultaneous crash of all the participating computers.
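
The idea of evaluating a quorum system over a recorded history can be illustrated with a minimal sketch. This is not the GQE tool itself, whose design the abstract does not describe. It assumes a hypothetical history format (a list of time intervals, each with the connected components of live hosts) and checks only one simple static system, majority voting. The names `Snapshot`, `majority_available`, and `availability`, and the sample data, are illustrative assumptions.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical history entry: (duration in seconds, connected components of
# the hosts that are up). Crashed hosts appear in no component.
Snapshot = Tuple[float, List[FrozenSet[str]]]


def majority_available(components: List[FrozenSet[str]],
                       all_hosts: FrozenSet[str]) -> bool:
    """True if some connected component holds a strict majority of all hosts."""
    return any(2 * len(c) > len(all_hosts) for c in components)


def availability(history: List[Snapshot], all_hosts: FrozenSet[str]) -> float:
    """Fraction of recorded time during which a majority quorum existed."""
    total = sum(duration for duration, _ in history)
    up = sum(duration for duration, comps in history
             if majority_available(comps, all_hosts))
    return up / total if total else 0.0


# Made-up sample data, for illustration only (not from the paper).
hosts = frozenset({"a", "b", "c", "d", "e"})
history: List[Snapshot] = [
    (60.0, [frozenset({"a", "b", "c", "d", "e"})]),            # fully connected
    (30.0, [frozenset({"a", "b"}), frozenset({"c", "d", "e"})]),  # partition
    (10.0, [frozenset({"a", "b"}), frozenset({"c", "d"})]),    # "e" crashed too
]

print(availability(history, hosts))
```

A static system such as this one fixes its quorums in advance, whereas a dynamic system would also have to carry state from one snapshot to the next, so evaluating it requires replaying the history in order rather than scoring each interval independently.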