FATE and DESTINI: a framework for cloud recovery testing

  • Authors:
  • Haryadi S. Gunawi; Thanh Do; Pallavi Joshi; Peter Alvaro; Joseph M. Hellerstein; Andrea C. Arpaci-Dusseau; Remzi H. Arpaci-Dusseau; Koushik Sen; Dhruba Borthakur

  • Affiliations:
  • University of California, Berkeley; University of Wisconsin, Madison; University of California, Berkeley; University of California, Berkeley; University of California, Berkeley; University of Wisconsin, Madison; University of Wisconsin, Madison; University of California, Berkeley; Facebook

  • Venue:
  • Proceedings of the 8th USENIX conference on Networked systems design and implementation
  • Year:
  • 2011

Abstract

As the cloud era begins and failures become commonplace, failure recovery becomes a critical factor in the availability, reliability and performance of cloud services. Unfortunately, recovery problems still occur, causing downtime, data loss, and many other issues. We propose a new testing framework for cloud recovery: FATE (Failure Testing Service) and DESTINI (Declarative Testing Specifications). With FATE, recovery is systematically tested in the face of multiple failures. With DESTINI, correct recovery is specified clearly, concisely, and precisely. We have integrated our framework into several cloud systems (e.g., HDFS [33]), explored over 40,000 failure scenarios, written 74 specifications, found 16 new bugs, and reproduced 51 old bugs.
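
To make the two ideas in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. It enumerates combinations of injected failures (the role the abstract assigns to FATE) and checks each outcome against a specification written as a predicate over observed state (a Python stand-in for the role of DESTINI, whose actual specification language is not shown here). The failure points, the replication factor, the spare-node count, and every function name are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical failure points in a toy 3-replica write (assumption, not from the paper).
FAILURE_POINTS = ["crash_node_0", "crash_node_1", "crash_node_2"]
TARGET_REPLICAS = 3  # toy replication factor (assumption)
SPARE_NODES = 1      # toy number of standby nodes recovery can use (assumption)


def failure_scenarios(points, max_failures):
    """Yield every combination of 0..max_failures injected failures."""
    for k in range(max_failures + 1):
        for combo in combinations(points, k):
            yield combo


def run_toy_write(scenario):
    """Simulate a write followed by recovery; return the live replica count."""
    survivors = TARGET_REPLICAS - len(scenario)
    # Toy recovery: re-replicate onto spare nodes, limited by how many exist.
    recovered = min(len(scenario), SPARE_NODES)
    return survivors + recovered


def spec_fully_replicated(live_replicas):
    """Specification as a predicate: the data must not stay under-replicated."""
    return live_replicas == TARGET_REPLICAS


def explore(max_failures):
    """Return the failure scenarios whose outcome violates the specification."""
    return [
        scenario
        for scenario in failure_scenarios(FAILURE_POINTS, max_failures)
        if not spec_fully_replicated(run_toy_write(scenario))
    ]


if __name__ == "__main__":
    print("violations with at most 1 failure:", explore(1))
    print("violations with at most 2 failures:", explore(2))
```

In this toy model every single-failure scenario recovers, while each two-failure scenario leaves the data under-replicated. That is the kind of problem that only surfaces when multiple failures are explored together, which is the motivation the abstract gives for systematic multi-failure testing.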