A 'cool' load balancer for parallel applications

Authors:
Osman Sarood;Laxmikant V. Kale
Affiliations:
University of Illinois at Urbana-Champaign, Urbana, IL;University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Year:
2011

Abstract

Meeting the power requirements of future exascale machines will be a major challenge. This paper focuses on minimizing cooling power: we propose a technique that combines dynamic voltage and frequency scaling (DVFS) with temperature-aware load balancing to constrain core temperatures and save cooling energy. Our scheme is designed specifically for parallel applications, which are typically tightly coupled. Temperature control comes at the cost of execution time, and we try to minimize this timing penalty. We experiment with three applications with different power utilization profiles, run on a 128-core (32-node) cluster with a dedicated air conditioning unit. We evaluate the efficacy of our scheme on three metrics: the ability to control average core temperatures and thereby avoid hot spots, minimization of the timing penalty, and cooling energy savings. Our results show cooling energy savings of up to 57% with a timing penalty of 19%.
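
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how DVFS and load balancing might be combined in general. The frequency levels, the temperature threshold, the one-step adjustment rule, and the proportional work split are all illustrative assumptions, and every name in the code is hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical frequency levels in GHz (assumption, not from the paper).
FREQ_LEVELS = [1.2, 1.6, 2.0, 2.4]


@dataclass
class Core:
    temp_c: float  # last measured core temperature in degrees Celsius
    level: int     # index into FREQ_LEVELS


def adjust_frequencies(cores, max_temp_c):
    """One DVFS step: slow down cores above the threshold, speed up cores below it."""
    for core in cores:
        if core.temp_c > max_temp_c and core.level > 0:
            core.level -= 1
        elif core.temp_c < max_temp_c and core.level < len(FREQ_LEVELS) - 1:
            core.level += 1


def rebalance(cores, total_work):
    """Split work in proportion to each core's current frequency.

    In a tightly coupled application the slowest core sets the pace, so
    slower cores receive less work in this sketch.
    """
    total_freq = sum(FREQ_LEVELS[c.level] for c in cores)
    return [total_work * FREQ_LEVELS[c.level] / total_freq for c in cores]


# Usage: one hot core and one cool core, with an assumed 50 C threshold.
cores = [Core(temp_c=55.0, level=3), Core(temp_c=45.0, level=3)]
adjust_frequencies(cores, max_temp_c=50.0)
shares = rebalance(cores, total_work=100.0)
```

After the adjustment step the hot core runs one level lower, and the rebalancing step assigns it a smaller share of the work so that both cores would finish at roughly the same time.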