ZZ and the art of practical BFT execution

  • Authors:
  • Timothy Wood;Rahul Singh;Arun Venkataramani;Prashant Shenoy;Emmanuel Cecchet

  • Affiliations:
  • University of Massachusetts Amherst, Amherst, MA, USA;University of Massachusetts Amherst, Amherst, MA, USA;University of Massachusetts Amherst, Amherst, MA, USA;University of Massachusetts Amherst, Amherst, MA, USA;University of Massachusetts Amherst, Amherst, MA, USA

  • Venue:
  • Proceedings of the sixth conference on Computer systems
  • Year:
  • 2011

Quantified Score

Hi-index 0.01

Visualization

Abstract

The high replication cost of Byzantine fault-tolerance (BFT) methods has been a major barrier to their widespread adoption in commercial distributed applications. We present ZZ, a new approach that reduces the replication cost of BFT services from 2f+1 to practically f+1. The key insight in ZZ is to use f+1 execution replicas in the normal case and to activate additional replicas only upon failures. In data centers where multiple applications share a physical server, ZZ reduces the aggregate number of execution replicas running in the data center, improving throughput and response times. ZZ relies on virtualization---a technology already employed in modern data centers---for fast replica activation upon failures, and enables newly activated replicas to immediately begin processing requests by fetching state on-demand. A prototype implementation of ZZ using the BASE library and Xen shows that, when compared to a system with 2f+1 replicas, our approach yields lower response times and up to 33% higher throughput in a prototype data center with four BFT web applications. We also show that ZZ can handle simultaneous failures and achieve sub-second recovery.