Elastic storage systems can be expanded or contracted to meet current demand, allowing servers to be turned off or used for other tasks. However, the usefulness of an elastic distributed storage system is limited by its agility: how quickly it can increase or decrease its number of servers. Due to the large amount of data they must migrate during elastic resizing, state-of-the-art designs usually have to make painful tradeoffs among performance, elasticity and agility. This paper describes an elastic storage system, called SpringFS, that can quickly change its number of active servers, while retaining elasticity and performance goals. SpringFS uses a novel technique, termed bounded write offloading, that restricts the set of servers where writes to overloaded servers are redirected. This technique, combined with the read offloading and passive migration policies used in SpringFS, minimizes the work needed before deactivation or activation of servers. Analysis of real-world traces from Hadoop deployments at Facebook and various Cloudera customers and experiments with the SpringFS prototype confirm SpringFS's agility, show that it reduces the amount of data migrated for elastic resizing by up to two orders of magnitude, and show that it cuts the percentage of active servers required by 67-82%, outdoing state-of-the-art designs by 6-120%.
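The idea of bounded write offloading can be illustrated with a minimal sketch. This is not the SpringFS implementation: the `Server` class, the `load` and `capacity` fields, the "least-loaded candidate" rule, and the choice of the first three servers as the offload set are all hypothetical, introduced only to show what it means to restrict where redirected writes may land.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    load: int = 0        # outstanding writes (hypothetical load metric)
    capacity: int = 10   # load at which the server counts as overloaded (assumed)

    def overloaded(self) -> bool:
        return self.load >= self.capacity


def choose_write_target(primary: Server, offload_set: list[Server]) -> Server:
    """Pick the server that receives a write originally aimed at `primary`.

    A write is redirected only when `primary` is overloaded, and only to a
    member of the bounded `offload_set`, never to an arbitrary server.
    """
    if not primary.overloaded():
        return primary
    candidates = [s for s in offload_set
                  if s is not primary and not s.overloaded()]
    if not candidates:
        return primary  # no eligible offload server; keep the write local
    return min(candidates, key=lambda s: s.load)  # assumed tie-break policy


# Six servers; only s0..s2 may absorb redirected writes (the "bound").
servers = [Server(f"s{i}") for i in range(6)]
offload_set = servers[:3]

servers[0].load = 3
servers[1].load = 1
servers[2].load = 7
servers[4].load = 10   # s4 is overloaded

target = choose_write_target(servers[4], offload_set)
target.load += 1
print(target.name)
```

In this sketch, servers outside the offload set (s3 and s5 here) never receive redirected writes, so under these assumptions they accumulate no offloaded data that would have to be moved before they are deactivated.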