Gecko: contention-oblivious disk arrays for cloud storage

Authors:
Ji-Yong Shin;Mahesh Balakrishnan;Tudor Marian;Hakim Weatherspoon
Affiliations:
Cornell University;Microsoft Research;Google;Cornell University
Venue:
FAST'13 Proceedings of the 11th USENIX conference on File and Storage Technologies
Year:
2013

Abstract

Disk contention is an increasingly significant problem for cloud storage, as applications are forced to co-exist on machines and share physical disk resources. Disks are notoriously sensitive to contention; a single application's random I/O is sufficient to reduce the throughput of a disk array by an order of magnitude, disrupting every other application running on the same array. Log-structured storage designs can alleviate write-write contention between applications by sequentializing all writes, but have historically suffered from read-write contention triggered by garbage collection (GC) as well as application reads. Gecko is a novel log-structured design that eliminates read-write contention by chaining together a small number of drives into a single log, effectively separating the tail of the log (where writes are appended) from its body. As a result, writes proceed to the tail drive without contention from either GC reads or first-class reads, which are restricted to the body of the log with the help of a tail-specific caching policy. Gecko trades the maximum contention-free sequential throughput of multiple drives for a stable and predictable maximum throughput from a single uncontended drive, and outperforms native log-structured or RAID-based systems in most cases. Our in-kernel implementation provides applications with 60 to 120 MB/s of random write bandwidth, despite concurrent GC activity, application reads, and an adversarial workload.
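
The chained-log idea in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' in-kernel implementation: the class `ChainedLog`, its fields, and the drive and block counts are hypothetical, garbage collection and log wrap-around are omitted, and the tail-specific caching policy is approximated by keeping every block on the current tail drive in memory. The sketch shows the one property the abstract emphasizes: reads that would land on the tail drive are absorbed by the cache, so the tail drive only ever sees sequential appends.

```python
from collections import defaultdict


class ChainedLog:
    """Toy model of one log chained across several drives (hypothetical, not Gecko's code)."""

    def __init__(self, num_drives=3, blocks_per_drive=4):
        self.num_drives = num_drives
        self.blocks_per_drive = blocks_per_drive
        self.tail = 0                   # next physical log position to be written
        self.l2p = {}                   # logical block address -> physical log position
        self.media = {}                 # physical log position -> data (stands in for the disks)
        self.tail_cache = {}            # in-memory copy of blocks on the current tail drive (assumption)
        self.reads = defaultdict(int)   # disk reads issued per drive
        self.writes = defaultdict(int)  # disk writes issued per drive

    def drive_of(self, pos):
        """Drive that holds a given physical log position."""
        return (pos // self.blocks_per_drive) % self.num_drives

    def write(self, lba, data):
        """Append to the tail of the log, whatever the logical address is."""
        if self.tail % self.blocks_per_drive == 0:
            # The tail just moved onto the next drive in the chain;
            # the previous tail drive becomes part of the body.
            self.tail_cache.clear()
        self.media[self.tail] = data
        self.tail_cache[self.tail] = data
        self.l2p[lba] = self.tail
        self.writes[self.drive_of(self.tail)] += 1
        self.tail += 1

    def read(self, lba):
        """Serve reads from the tail cache if possible, otherwise from a body drive."""
        pos = self.l2p[lba]
        if pos in self.tail_cache:
            return self.tail_cache[pos]  # no disk I/O on the tail drive
        self.reads[self.drive_of(pos)] += 1
        return self.media[pos]


log = ChainedLog(num_drives=3, blocks_per_drive=4)
for lba in range(6):
    log.write(lba, f"data{lba}")
for lba in range(6):
    log.read(lba)

tail_drive = log.drive_of(log.tail)
print("tail drive:", tail_drive)
print("disk reads on tail drive:", log.reads[tail_drive])
print("disk reads on body drive 0:", log.reads[0])
```

In this run the first four blocks fill drive 0 and the next two go to drive 1, which is then the tail drive. Reads of the first four blocks hit drive 0, while reads of the last two are served from memory, so the tail drive's read counter stays at zero.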