GreenHadoop: leveraging green energy in data-processing frameworks

Authors:
Íñigo Goiri;Kien Le;Thu D. Nguyen;Jordi Guitart;Jordi Torres;Ricardo Bianchini
Affiliations:
Rutgers University, Piscataway, NJ, USA;Rutgers University, Piscataway, NJ, USA;Rutgers University, Piscataway, NJ, USA;Universitat Politècnica de Catalunya/Barcelona Supercomputing Center, Barcelona, Spain;Universitat Politècnica de Catalunya/Barcelona Supercomputing Center, Barcelona, Spain;Rutgers University, Piscataway, NJ, USA
Venue:
Proceedings of the 7th ACM European Conference on Computer Systems
Year:
2012

Abstract

Interest has been growing in powering datacenters (at least partially) with renewable or "green" sources of energy, such as solar or wind. However, it is challenging to use these sources because, unlike the "brown" (carbon-intensive) energy drawn from the electrical grid, they are not always available. Energy demand and supply must therefore be matched if we are to take full advantage of green energy and minimize brown energy consumption. In this paper, we investigate how to manage a datacenter's computational workload to match the green energy supply. In particular, we consider data-processing frameworks, in which many background computations can be delayed by a bounded amount of time. We propose GreenHadoop, a MapReduce framework for a datacenter powered by a photovoltaic solar array and the electrical grid (as a backup). GreenHadoop predicts the amount of solar energy that will be available in the near future, and schedules MapReduce jobs to maximize green energy consumption within the jobs' time bounds. If brown energy must be used to avoid time bound violations, GreenHadoop selects times when brown energy is cheap, while also managing the cost of peak brown power consumption. Our experimental results demonstrate that GreenHadoop can significantly increase green energy consumption and decrease electricity cost, compared to Hadoop.
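The scheduling idea in the abstract can be illustrated with a small sketch. This is not GreenHadoop's actual algorithm, only a simplified greedy planner written under stated assumptions: time is divided into equal slots up to the deadline, the solar forecast and grid prices per slot are given, and the cluster can consume at most a fixed amount of energy per slot. The names `Slot`, `plan_energy`, `demand_kwh`, and `capacity_kwh` are hypothetical, and the cost of peak brown power, which the abstract says GreenHadoop also manages, is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    green_kwh: float    # predicted solar energy available in this slot (assumed given)
    brown_price: float  # grid electricity price in this slot, in $/kWh (assumed given)


def plan_energy(slots, demand_kwh, capacity_kwh):
    """Greedy plan: use predicted green energy first, then the cheapest brown energy.

    `slots` covers the window from now until the deadline. Returns two lists with
    the green and brown energy (kWh) to consume in each slot. Hypothetical helper;
    it ignores peak-power charges and forecast errors.
    """
    green_use = [0.0] * len(slots)
    brown_use = [0.0] * len(slots)
    remaining = demand_kwh

    # Pass 1: consume green energy wherever it is predicted, earliest slot first.
    for i, slot in enumerate(slots):
        take = min(slot.green_kwh, capacity_kwh, remaining)
        green_use[i] = take
        remaining -= take

    # Pass 2: cover the remaining demand with brown energy in the cheapest slots.
    for i in sorted(range(len(slots)), key=lambda k: slots[k].brown_price):
        if remaining <= 0:
            break
        take = min(capacity_kwh - green_use[i], remaining)
        brown_use[i] = take
        remaining -= take

    if remaining > 1e-9:
        raise ValueError("demand cannot be met before the deadline")
    return green_use, brown_use


# Usage: four slots, cheap grid power at night (slots 0 and 3), solar at midday.
slots = [Slot(0.0, 0.08), Slot(3.0, 0.13), Slot(2.0, 0.13), Slot(0.0, 0.08)]
green, brown = plan_energy(slots, demand_kwh=7.0, capacity_kwh=4.0)
print(green)  # [0.0, 3.0, 2.0, 0.0]
print(brown)  # [2.0, 0.0, 0.0, 0.0]
```

In this example 5 of the 7 kWh are served by the solar forecast, and the remaining 2 kWh are placed in a cheap off-peak slot rather than in the more expensive midday slots.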