Understanding the robustness of SSDS under power fault

  • Authors:
  • Mai Zheng;Joseph Tucek;Feng Qin;Mark Lillibridge

  • Affiliations:
  • The Ohio State University;HP Labs;The Ohio State University;HP Labs

  • Venue:
  • FAST'13 Proceedings of the 11th USENIX conference on File and Storage Technologies
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Modern storage technology (SSDs, No-SQL databases, commoditized RAID hardware, etc.) bring new reliability challenges to the already complicated storage stack. Among other things, the behavior of these new components during power faults--which happen relatively frequently in data centers--is an important yet mostly ignored issue in this dependability-critical area. Understanding how new storage components behave under power fault is the first step towards designing new robust storage systems. In this paper, we propose a new methodology to expose reliability issues in block devices under power faults. Our framework includes specially-designed hardware to inject power faults directly to devices, workloads to stress storage components, and techniques to detect various types of failures. Applying our testing framework, we test fifteen commodity SSDs from five different vendors using more than three thousand fault injection cycles in total. Our experimental results reveal that thirteen out of the fifteen tested SSD devices exhibit surprising failure behaviors under power faults, including bit corruption, shorn writes, unserializable writes, metadata corruption, and total device failure.