Gecko: contention-oblivious disk arrays for cloud storage

  • Authors: Ji-Yong Shin; Mahesh Balakrishnan; Tudor Marian; Hakim Weatherspoon
  • Affiliations: Cornell University; Microsoft Research; Google; Cornell University
  • Venue: FAST'13 Proceedings of the 11th USENIX Conference on File and Storage Technologies
  • Year: 2013

Abstract

Disk contention is an increasingly significant problem for cloud storage, as applications are forced to co-exist on machines and share physical disk resources. Disks are notoriously sensitive to contention; a single application's random I/O is sufficient to reduce the throughput of a disk array by an order of magnitude, disrupting every other application running on the same array. Log-structured storage designs can alleviate write-write contention between applications by sequentializing all writes, but have historically suffered from read-write contention triggered by garbage collection (GC) as well as application reads. Gecko is a novel log-structured design that eliminates read-write contention by chaining together a small number of drives into a single log, effectively separating the tail of the log (where writes are appended) from its body. As a result, writes proceed to the tail drive without contention from either GC reads or first-class reads, which are restricted to the body of the log with the help of a tail-specific caching policy. Gecko trades the maximum contention-free sequential throughput of multiple drives for a stable and predictable maximum throughput from a single uncontended drive, and it outperforms native log-structured or RAID-based systems in most cases. Our in-kernel implementation provides random write bandwidth to applications of 60 to 120 MB/s, despite concurrent GC activity, application reads, and an adversarial workload.
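
The separation of the log's tail from its body can be illustrated with a toy in-memory model. The sketch below is not the authors' in-kernel implementation: the names `Drive`, `ChainedLog`, and `tail_cache` are hypothetical, drives are modelled as Python lists, and garbage collection, mirroring, and wrap-around of the chain are all omitted. It only shows the idea the abstract describes: appends go to one tail drive, reads of recently written blocks are absorbed by a tail-specific cache, and all remaining reads land on body drives.

```python
from dataclasses import dataclass, field


@dataclass
class Drive:
    capacity: int                                # blocks this drive can hold
    blocks: list = field(default_factory=list)   # appended data, in log order
    reads: int = 0                               # reads that reached this drive
    writes: int = 0                              # writes that reached this drive


class ChainedLog:
    """Toy model (hypothetical): one logical log laid out across a chain of drives."""

    def __init__(self, num_drives: int, blocks_per_drive: int):
        self.drives = [Drive(blocks_per_drive) for _ in range(num_drives)]
        self.tail = 0           # index of the drive currently taking appends
        self.index = {}         # logical address -> (drive index, offset)
        self.tail_cache = {}    # logical address -> data held on the tail drive

    def write(self, addr, data):
        drive = self.drives[self.tail]
        if len(drive.blocks) == drive.capacity:
            # Tail drive is full: advance down the chain. The old tail becomes
            # part of the body, so its cached blocks are dropped.
            if self.tail + 1 == len(self.drives):
                raise RuntimeError("log full; this sketch omits GC and wrap-around")
            self.tail += 1
            self.tail_cache.clear()
            drive = self.drives[self.tail]
        drive.blocks.append(data)      # every write is a sequential append
        drive.writes += 1
        self.index[addr] = (self.tail, len(drive.blocks) - 1)
        self.tail_cache[addr] = data

    def read(self, addr):
        drive_idx, offset = self.index[addr]
        if drive_idx == self.tail:
            # Served from memory, so the tail drive sees no read I/O.
            return self.tail_cache[addr]
        body = self.drives[drive_idx]
        body.reads += 1
        return body.blocks[offset]


if __name__ == "__main__":
    log = ChainedLog(num_drives=3, blocks_per_drive=4)
    for addr in range(6):
        log.write(addr, f"v{addr}")
    for addr in range(6):
        log.read(addr)
    for i, d in enumerate(log.drives):
        role = "tail" if i == log.tail else "body/unused"
        print(i, role, "reads:", d.reads, "writes:", d.writes)
```

In this model the read counter of the tail drive stays at zero no matter how the workload mixes reads and writes, which is the property that keeps appends sequential. A real system would also have to bound the cache size and reclaim space in the body, neither of which is attempted here.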