On fault resilience of OpenStack

  • Authors: Xiaoen Ju; Livio Soares; Kang G. Shin; Kyung Dong Ryu; Dilma Da Silva

  • Affiliations (in author order): University of Michigan; IBM T. J. Watson Research Center; University of Michigan; IBM T. J. Watson Research Center; Qualcomm Research Silicon Valley

  • Venue: Proceedings of the 4th Annual Symposium on Cloud Computing
  • Year: 2013

Abstract

Cloud-management stacks have become an increasingly important element of cloud computing, serving as the resource manager of cloud platforms. While the functionality of this emerging layer has been expanding steadily, its fault resilience remains under-studied. This paper presents a systematic study of the fault resilience of OpenStack, a popular open-source cloud-management stack. We have built a prototype fault-injection framework that targets service communications during the processing of external requests, both among OpenStack services and between OpenStack and external services, and have thus far uncovered 23 bugs in two versions of OpenStack. Our findings shed light on defects in the design and implementation of state-of-the-art cloud-management stacks from a fault-resilience perspective.
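
To make the idea of injecting faults into service communications concrete, the following is a minimal sketch in Python. It is not the authors' framework and does not use any OpenStack API: the class `FaultInjector`, the exception `InjectedFault`, the service names, and the single crash-style fault are all hypothetical, chosen only to illustrate how a call between two services might be intercepted and failed on purpose while the outcome is recorded.

```python
class InjectedFault(Exception):
    """Stands in for a real communication failure (hypothetical name)."""


class FaultInjector:
    """Wraps service-to-service calls and fails the targeted ones on purpose."""

    def __init__(self, targets):
        # targets: (caller, callee) pairs whose communication should fail.
        self.targets = set(targets)
        # log: one ((caller, callee), outcome) entry per intercepted call.
        self.log = []

    def call(self, caller, callee, func, *args, **kwargs):
        edge = (caller, callee)
        if edge in self.targets:
            # Simulate a failure on this communication path (an assumption;
            # the abstract does not say which fault types are injected).
            self.log.append((edge, "crash"))
            raise InjectedFault(f"injected crash on {caller} -> {callee}")
        self.log.append((edge, "ok"))
        return func(*args, **kwargs)


def schedule_vm(name):
    # Placeholder for work done by a downstream service (hypothetical).
    return f"scheduled {name}"


injector = FaultInjector(targets={("api", "scheduler")})

# A call on a targeted path fails with the injected fault.
try:
    injector.call("api", "scheduler", schedule_vm, "vm-1")
    outcome = "no fault"
except InjectedFault as exc:
    outcome = str(exc)

# A call on an untargeted path goes through unchanged.
result = injector.call("api", "compute", schedule_vm, "vm-2")
```

In a study of this kind, the interesting part is what the caller does after the injected failure, for example whether it retries, times out, or leaves state inconsistent; the log above only records which paths were exercised.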