No one (cluster) size fits all: automatic cluster sizing for data-intensive analytics

Authors:
Herodotos Herodotou;Fei Dong;Shivnath Babu
Affiliations:
Duke University;Duke University;Duke University
Venue:
Proceedings of the 2nd ACM Symposium on Cloud Computing
Year:
2011

Abstract

Infrastructure-as-a-Service (IaaS) cloud platforms have brought two unprecedented changes to cluster provisioning practices. First, any (nonexpert) user can provision a cluster of any size on the cloud within minutes to run her data-processing jobs. The user can terminate the cluster once her jobs complete, and she pays only for the resources used and the duration of use. Second, cloud platforms enable users to bypass the traditional middleman, the system administrator, in the cluster-provisioning process. These changes give tremendous power to the user, but place a major burden on her shoulders. The user now regularly faces complex cluster sizing problems that involve finding the cluster size, the type of resources to use in the cluster from the large number of choices offered by current IaaS cloud platforms, and the job configurations that best meet the performance needs of her workload. In this paper, we introduce the Elastisizer, a system to which users can express cluster sizing problems as queries in a declarative fashion. The Elastisizer provides reliable answers to these queries using an automated technique that combines job profiling, estimation using black-box and white-box models, and simulation. We have prototyped the Elastisizer for the Hadoop MapReduce framework, and present a comprehensive evaluation that shows the benefits of the Elastisizer in common scenarios where cluster sizing problems arise.
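
To make the cluster sizing problem described in the abstract concrete, the following is a minimal sketch of the search it implies: enumerate candidate resource types and cluster sizes, estimate the running time of each, and keep the cheapest plan that meets a deadline. This is not the Elastisizer's technique. The names `NodeType`, `Plan`, `estimate_hours`, and `cheapest_plan`, the prices and speeds, and the Amdahl-style running-time model are all hypothetical stand-ins chosen for illustration; the paper's system relies on job profiling, black-box and white-box models, and simulation instead of a closed-form formula.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterable, Optional


@dataclass(frozen=True)
class NodeType:
    name: str
    hourly_price: float    # hypothetical price per node-hour
    relative_speed: float  # hypothetical speed relative to a baseline node


@dataclass(frozen=True)
class Plan:
    node_type: str
    num_nodes: int
    est_hours: float
    est_cost: float


def estimate_hours(work: float, node: NodeType, num_nodes: int,
                   serial_fraction: float = 0.05) -> float:
    """Toy running-time model (an assumption, not the paper's model).

    `work` is the job size in baseline node-hours. A fixed fraction of it
    is treated as serial; the rest is spread evenly over all nodes.
    """
    serial = work * serial_fraction / node.relative_speed
    parallel = work * (1.0 - serial_fraction) / (num_nodes * node.relative_speed)
    return serial + parallel


def cheapest_plan(work: float, node_types: Iterable[NodeType],
                  sizes: Iterable[int], deadline_hours: float) -> Optional[Plan]:
    """Exhaustively search (node type, cluster size) pairs.

    Returns the lowest-cost plan whose estimated running time meets the
    deadline, or None if no candidate does.
    """
    best: Optional[Plan] = None
    for node, n in product(list(node_types), list(sizes)):
        hours = estimate_hours(work, node, n)
        if hours > deadline_hours:
            continue  # misses the deadline
        # Pay-as-you-go billing: nodes x hours x price per node-hour.
        cost = hours * n * node.hourly_price
        if best is None or cost < best.est_cost:
            best = Plan(node.name, n, hours, cost)
    return best


if __name__ == "__main__":
    catalog = [NodeType("small", 0.10, 1.0), NodeType("large", 0.50, 4.0)]
    print(cheapest_plan(100.0, catalog, [1, 2, 4, 8, 16], deadline_hours=10.0))
```

Even in this toy setting the cheapest feasible plan is neither the smallest nor the largest cluster, which is the kind of trade-off that makes a declarative query interface attractive. A real answer would also have to search over job configuration parameters, which this sketch leaves out.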