AROMA: automated resource allocation and configuration of mapreduce environment in the cloud

Authors:
Palden Lama;Xiaobo Zhou
Affiliations:
University of Colorado at Colorado Springs, Colorado Springs, CO, USA;University of Colorado at Colorado Springs, Colorado Springs, CO, USA
Venue:
Proceedings of the 9th international conference on Autonomic computing
Year:
2012

Abstract

The distributed data processing framework MapReduce is increasingly deployed in Clouds to leverage the pay-per-usage cloud computing model. The popular Hadoop MapReduce environment expects end users to determine the type and amount of Cloud resources to reserve as well as the configuration of Hadoop parameters. However, such resource reservation and job provisioning decisions require in-depth knowledge of system internals and laborious but often ineffective parameter tuning. We propose and develop AROMA, a system that automates the allocation of heterogeneous Cloud resources and the configuration of Hadoop parameters to achieve quality of service goals while minimizing the incurred cost. It addresses the significant challenge of provisioning ad-hoc jobs that have performance deadlines in Clouds through a novel two-phase machine learning and optimization framework. Its technical core is a support vector machine based performance model that enables the integration of various aspects of resource provisioning and auto-configuration of Hadoop jobs. It adapts to ad-hoc jobs by robustly matching their resource utilization signature with those of previously executed jobs and making provisioning decisions accordingly. We implement AROMA as an automated job provisioning system for Hadoop MapReduce hosted on virtualized HP ProLiant blade servers. Experimental results show AROMA's effectiveness in providing performance guarantees for diverse Hadoop benchmark jobs while minimizing the cost of Cloud resource usage.
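
The two-phase idea in the abstract (match an ad-hoc job to previously executed jobs, then pick the cheapest configuration predicted to meet the deadline) can be illustrated with a minimal Python sketch. This is not AROMA's implementation: the signature values, VM types, prices, and the `predicted_runtime` formula are all hypothetical, and the toy formula merely stands in for the support vector machine based performance model, which the abstract does not specify in detail.

```python
import math
from itertools import product

# Hypothetical utilization signatures (cpu, disk, network) of past job classes.
PAST_JOBS = {
    "cpu_bound": (0.9, 0.2, 0.1),
    "io_bound": (0.3, 0.8, 0.4),
}

# Hypothetical VM catalogue: name -> (relative speed, price in $ per VM-hour).
VM_TYPES = {"small": (1.0, 0.10), "large": (2.0, 0.25)}


def match_signature(signature, history=PAST_JOBS):
    """Phase 1 (sketch): return the past job class with the nearest signature."""
    return min(history, key=lambda name: math.dist(signature, history[name]))


def predicted_runtime(job_class, num_vms, vm_speed, reduce_tasks, input_gb):
    """Toy stand-in for a learned performance model (assumption, not an SVM)."""
    seconds_per_gb = {"cpu_bound": 60.0, "io_bound": 40.0}[job_class]
    parallel_time = seconds_per_gb * input_gb / (num_vms * vm_speed)
    # Made-up penalty for a reducer count that does not fit the cluster size.
    tuning_penalty = 5.0 * abs(reduce_tasks - 2 * num_vms)
    return parallel_time + tuning_penalty


def cheapest_config(job_class, input_gb, deadline_s):
    """Phase 2 (sketch): exhaustive search for the cheapest feasible setup."""
    best = None
    for vm_type, num_vms, reduce_tasks in product(
        VM_TYPES, range(1, 9), range(1, 17)
    ):
        speed, price = VM_TYPES[vm_type]
        runtime = predicted_runtime(
            job_class, num_vms, speed, reduce_tasks, input_gb
        )
        if runtime > deadline_s:
            continue  # predicted to miss the deadline
        cost = price * num_vms * runtime / 3600.0
        if best is None or cost < best["cost"]:
            best = {
                "vm_type": vm_type,
                "num_vms": num_vms,
                "reduce_tasks": reduce_tasks,
                "runtime": runtime,
                "cost": cost,
            }
    return best  # None when no candidate meets the deadline


if __name__ == "__main__":
    job_class = match_signature((0.85, 0.25, 0.10))
    print(job_class, cheapest_config(job_class, input_gb=10, deadline_s=600))
```

The exhaustive search is only practical because the toy search space is tiny; a real system would presumably need a proper optimizer over a much larger space of resource and Hadoop parameter choices.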