SpringFS: bridging agility and performance in elastic distributed storage

Authors:
Lianghong Xu;James Cipar;Elie Krevat;Alexey Tumanov;Nitin Gupta;Michael A. Kozuch;Gregory R. Ganger
Affiliations:
Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;Intel Labs;Carnegie Mellon University
Venue:
FAST'14 Proceedings of the 12th USENIX conference on File and Storage Technologies
Year:
2014

Abstract

Elastic storage systems can be expanded or contracted to meet current demand, allowing servers to be turned off or used for other tasks. However, the usefulness of an elastic distributed storage system is limited by its agility: how quickly it can increase or decrease its number of servers. Due to the large amount of data they must migrate during elastic resizing, state-of-the-art designs usually have to make painful tradeoffs among performance, elasticity, and agility. This paper describes an elastic storage system, called SpringFS, that can quickly change its number of active servers while retaining elasticity and performance goals. SpringFS uses a novel technique, termed bounded write offloading, that restricts the set of servers where writes to overloaded servers are redirected. This technique, combined with the read offloading and passive migration policies used in SpringFS, minimizes the work needed before deactivation or activation of servers. Analysis of real-world traces from Hadoop deployments at Facebook and various Cloudera customers, together with experiments with the SpringFS prototype, confirms SpringFS's agility, shows that it reduces the amount of data migrated for elastic resizing by up to two orders of magnitude, and shows that it cuts the percentage of active servers required by 67-82%, outdoing state-of-the-art designs by 6-120%.
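
The idea of bounded write offloading can be illustrated with a small sketch. This is not the SpringFS implementation: the abstract only states that the set of redirect targets is restricted, so the `Server` type, the `pick_write_target` function, the `offload_bound` parameter, the load threshold, and the least-loaded selection rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    load: float  # fraction of capacity in use, 0.0 to 1.0 (hypothetical metric)


def pick_write_target(servers, preferred, offload_bound, overload_threshold=0.8):
    """Return the index of the server that should receive a write.

    Assumption: servers are ordered so that indices below `offload_bound`
    form the bounded offload set. Servers at or beyond that index never
    receive redirected writes.
    """
    # A write goes to its preferred server unless that server is overloaded.
    if servers[preferred].load < overload_threshold:
        return preferred
    # Otherwise redirect, but only within the bounded offload set.
    # Picking the least-loaded member is an illustrative policy choice.
    return min(range(offload_bound), key=lambda i: servers[i].load)


servers = [
    Server("s0", 0.9),  # overloaded
    Server("s1", 0.3),
    Server("s2", 0.5),
    Server("s3", 0.1),  # least loaded overall, but outside the offload set
]
target = pick_write_target(servers, preferred=0, offload_bound=3)
print(servers[target].name)
```

In this sketch the write is redirected to `s1` rather than to the idle `s3`, because `s3` lies outside the bound. Under these assumptions, servers outside the offload set accumulate no redirected writes, which is one plausible reading of how bounding the set reduces the work needed before a server is deactivated.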