Performance of garbage collection algorithms for flash-based solid state drives with hot/cold data

  • Authors:
  • Benny Van Houdt

  • Affiliations:
  • -

  • Venue:
  • Performance Evaluation
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

To avoid a poor random write performance, flash-based solid state drives typically rely on an internal log-structure. This log-structure reduces the write amplification and thereby improves the write throughput and extends the drive's lifespan. In this paper, we analyze the performance of the log-structure combined with the d-choices garbage collection algorithm, which repeatedly selects the block with the fewest number of valid pages out of a set of d randomly chosen blocks, and consider non-uniform random write workloads. Using a mean field model, we show that the write amplification worsens as the hot data gets hotter. Next, we introduce the double log-structure, which uses a separate log for internal and external write requests. Although the double log-structure performs identically to the single log-structure under uniform random writes, we show that it drastically reduces the write amplification of the d-choices algorithm in the presence of hot data. In other words, the double log-structure yields an automatic form of data separation. Further, with the double log-structure there exists an optimal value for d (typically around 10), meaning the greedy garbage collection algorithm is no longer optimal. Finally, both mean field models introduced in this paper are validated using simulation experiments.