Parsimony-Based Approach for Obtaining Resource-Efficient and Trustworthy Execution

  • Authors:
  • HariGovind V. Ramasamy;Adnan Agbaria;William H. Sanders

  • Affiliations:
  • Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL;Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL;Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL

  • Venue:
  • LADC'05 Proceedings of the Second Latin-American conference on Dependable Computing
  • Year:
  • 2005

Abstract

We propose a resource-efficient way to execute requests in Byzantine-fault-tolerant replication that is particularly well suited to services in which request processing is resource-intensive. Previous efforts took a failure-masking, all-active approach in which all 2t + 1 execution replicas execute every request, where t is the maximum number of failures tolerated. We describe an asynchronous execution protocol that combines failure masking with imperfect failure detection and checkpointing. Our protocol is parsimony-based in that it normally uses only t + 1 execution replicas, called the primary committee or pc, to execute requests. Under normal conditions, characterized by a stable network and no misbehavior by pc replicas, our approach yields a trustworthy reply with the same latency as the all-active approach, but with only about half of its overall resource use. However, a request that exposes faults among the pc replicas incurs a higher latency than in the all-active approach, mainly because of fault-detection latency. Under such conditions, the protocol switches to a recovery mode in which all 2t + 1 replicas execute the request and send their replies. Then, after a new pc is selected, request latency returns to the level of all-active execution. In practice, failures and instability are the exception rather than the norm. That observation motivated our decision to optimize resource efficiency for the common case, even if it means paying a slightly higher performance cost during periods of instability.
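
The execution rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' protocol: it runs synchronously in a single process, treats each replica as a plain function, and leaves out the asynchronous messaging, timeout-based failure detection, checkpointing, and pc reselection that the abstract mentions. The names `execute`, `resource_ratio`, `correct`, and `faulty` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence, Tuple

# Hypothetical model: a replica is a function from a request to a reply.
Replica = Callable[[str], Hashable]


def execute(request: str, replicas: Sequence[Replica], t: int) -> Tuple[Hashable, int]:
    """Return (trusted reply, number of replica executions used).

    Assumes at most t of the 2t + 1 replicas are faulty.
    """
    if len(replicas) != 2 * t + 1:
        raise ValueError("expected exactly 2t + 1 execution replicas")

    # Assumption for this sketch: the first t + 1 replicas form the pc.
    pc, backups = replicas[: t + 1], replicas[t + 1:]

    # Normal mode: only the t + 1 pc replicas execute the request.
    pc_replies = [replica(request) for replica in pc]
    if len(set(pc_replies)) == 1:
        # t + 1 identical replies include at least one from a correct replica.
        return pc_replies[0], t + 1

    # Recovery mode: the disagreement exposes a fault in the pc, so the
    # remaining replicas execute too and the 2t + 1 replies are voted on.
    all_replies = pc_replies + [replica(request) for replica in backups]
    reply, votes = Counter(all_replies).most_common(1)[0]
    if votes < t + 1:
        raise RuntimeError("more than t replicas appear to be faulty")
    return reply, 2 * t + 1


def resource_ratio(t: int) -> float:
    """Normal-mode executions relative to all-active: (t + 1) / (2t + 1)."""
    return (t + 1) / (2 * t + 1)


def correct(request: str) -> str:
    # Stand-in for a correct replica's deterministic request processing.
    return request.upper()


def faulty(request: str) -> str:
    # Stand-in for a Byzantine replica returning a wrong reply.
    return "bogus"
```

With t = 1, `execute("x", [correct, correct, correct], 1)` uses two executions, while `execute("x", [faulty, correct, correct], 1)` falls back to all three and still returns the correct reply. The ratio (t + 1) / (2t + 1) approaches one half as t grows, which matches the abstract's "about half" of the all-active resource use.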