Harmonia: A globally coordinated garbage collector for arrays of Solid-State Drives

  • Authors:
  • Youngjae Kim; Sarp Oral; Galen M. Shipman; Junghee Lee; David A. Dillow; Feiyi Wang

  • Affiliations:
  • National Center for Computational Sciences, Oak Ridge National Laboratory, TN 37831-6016, USA (all authors)

  • Venue:
  • MSST '11 Proceedings of the 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies
  • Year:
  • 2011

Abstract

Solid-State Drives (SSDs) offer significant performance improvements over hard disk drives (HDDs) on a number of workloads. The frequency of garbage collection (GC) activity is directly correlated with the pattern, frequency, and volume of write requests, and the scheduling of GC is controlled by logic internal to the SSD. SSDs can exhibit significant performance degradation when GC conflicts with an ongoing I/O request stream. When SSDs are used in a RAID array, the lack of coordination among the local GC processes amplifies this degradation. No RAID controller or SSD available today has the technology to overcome this limitation. This paper presents Harmonia, a Global Garbage Collection (GGC) mechanism that improves response times and reduces performance variability for a RAID array of SSDs. Our proposal includes a high-level design of an SSD-aware RAID controller and GGC-capable SSD devices, as well as algorithms to coordinate the global GC cycles. Our simulations show that this design improves response time and reduces performance variability for a wide variety of enterprise workloads. For bursty, write-dominant workloads, response time improved by 69% and performance variability was reduced by 71%.
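
The intuition behind coordinating GC across an array can be illustrated with a toy model. The sketch below is not the paper's simulator or its coordination algorithm; it assumes, purely for illustration, that every request is striped across all devices (so it stalls whenever any one device is collecting) and that each device collects for a fixed window once per fixed period. The names `in_gc` and `stalled_steps` and all constants are hypothetical, and the resulting counts are unrelated to the 69% and 71% figures reported above.

```python
from typing import List


def in_gc(t: int, offset: int, period: int, gc_len: int) -> bool:
    """Return True if a device whose GC window starts at `offset`
    (modulo `period`) is collecting at time step `t`."""
    return (t - offset) % period < gc_len


def stalled_steps(offsets: List[int], period: int, gc_len: int, horizon: int) -> int:
    """Count time steps in which at least one device is in GC.

    Under the toy assumption that every request touches all devices,
    these are the steps at which a request would be delayed.
    """
    return sum(
        any(in_gc(t, off, period, gc_len) for off in offsets)
        for t in range(horizon)
    )


# Hypothetical parameters: 4 devices, 10-step GC window every 100 steps.
PERIOD, GC_LEN, HORIZON = 100, 10, 1000

uncoordinated = [0, 25, 50, 75]  # each device collects on its own schedule
coordinated = [0, 0, 0, 0]       # all GC windows aligned by a controller

uncoordinated_stalls = stalled_steps(uncoordinated, PERIOD, GC_LEN, HORIZON)
coordinated_stalls = stalled_steps(coordinated, PERIOD, GC_LEN, HORIZON)

print("uncoordinated:", uncoordinated_stalls)
print("coordinated:  ", coordinated_stalls)
```

In this model the unaligned schedule exposes the array to four separate GC windows per period, whereas the aligned schedule overlaps them into one. Real devices trigger GC based on write patterns rather than fixed timers, so the model only conveys why alignment can help, not how much.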