Energy proportionality and performance in data parallel computing clusters

Authors:
Jinoh Kim; Jerry Chou; Doron Rotem
Affiliations:
Lawrence Berkeley National Laboratory, University of California, Berkeley, CA; Lawrence Berkeley National Laboratory, University of California, Berkeley, CA; Lawrence Berkeley National Laboratory, University of California, Berkeley, CA
Venue:
SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
Year:
2011


Abstract

Energy consumption in datacenters has recently become a major concern due to rising operational costs and scalability issues. Recent solutions to this problem propose the principle of energy proportionality, i.e., the amount of energy consumed by the server nodes must be proportional to the amount of work performed. For data parallelism and fault tolerance, the file systems most commonly used in MapReduce-type clusters maintain a set of replicas for each data block. A covering set is a group of nodes that together contain at least one replica of each data block needed for performing computing tasks. In this work, we develop and analyze algorithms that maintain energy proportionality by discovering a covering set that minimizes energy consumption while placing the remaining nodes in low-power standby mode. Our algorithms can also discover covering sets in heterogeneous computing environments. To allow more data parallelism, we generalize our algorithms so that they can discover k-covering sets, i.e., sets of nodes that together contain at least k replicas of each data block. Our experimental results show that we can achieve substantial energy savings without significant performance loss in diverse cluster configurations and working environments.
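
To make the notion of a k-covering set concrete, the following is a minimal sketch that assumes a simple greedy heuristic; the abstract does not specify the algorithms, so this should not be read as the authors' method. The function name `greedy_k_cover`, the `placement` mapping (node to the set of block IDs it stores), and the per-node `power` weights are hypothetical names introduced only for illustration. At each step the heuristic activates the node that covers the most still-needed replicas per unit of power, which also gives a rough way to account for heterogeneous nodes.

```python
def greedy_k_cover(placement, needed, k=1, power=None):
    """Pick nodes so that every needed block has >= k replicas on active nodes.

    placement: dict mapping node name -> set of block IDs stored on that node
    needed:    iterable of block IDs required by the computation
    k:         number of replicas that must remain reachable per block
    power:     optional dict node -> relative power cost (assumed 1.0 if omitted)
    NOTE: greedy heuristic chosen for illustration; not necessarily optimal.
    """
    needed = set(needed)
    if power is None:
        power = {node: 1.0 for node in placement}

    # Fail early if some block simply does not have k replicas in the cluster.
    for block in needed:
        holders = sum(1 for blocks in placement.values() if block in blocks)
        if holders < k:
            raise ValueError(f"block {block!r} has only {holders} replica(s)")

    remaining = {block: k for block in needed}  # replicas still to be covered
    candidates = set(placement)
    chosen = []

    def gain(node):
        # Useful replicas this node would add, per unit of power cost.
        useful = sum(1 for b in placement[node] if remaining.get(b, 0) > 0)
        return useful / power[node]

    while any(remaining.values()):
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(candidates), key=gain)
        chosen.append(best)
        candidates.remove(best)
        for b in placement[best]:
            if remaining.get(b, 0) > 0:
                remaining[b] -= 1
    return chosen


# Hypothetical four-node cluster; every block has at least two replicas.
example_placement = {
    "n1": {1, 2, 3},
    "n2": {3, 4},
    "n3": {1, 4},
    "n4": {2, 4},
}
active_k1 = greedy_k_cover(example_placement, {1, 2, 3, 4}, k=1)
active_k2 = greedy_k_cover(example_placement, {1, 2, 3, 4}, k=2)
```

In this toy cluster, nodes outside the returned list are the ones that could be placed in standby. Raising `k` keeps more nodes active, which reflects the trade-off between energy savings and data parallelism that the abstract describes.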