Meeting the power requirements of the huge exascale machines of the future will be a major challenge. This paper focuses on minimizing cooling power: we propose a technique that combines dynamic voltage and frequency scaling (DVFS) with temperature-aware load balancing to constrain core temperatures and save cooling energy. The scheme is designed specifically for parallel applications, which are typically tightly coupled. Temperature control comes at the cost of execution time, and we try to minimize this timing penalty. We experiment with three applications with different power utilization profiles, run on a 128-core (32-node) cluster with a dedicated air conditioning unit. We evaluate the efficacy of our scheme using three metrics: the ability to control average core temperatures, thereby avoiding hot spots; minimization of the timing penalty; and cooling energy savings. Our results show cooling energy savings of up to 57% with a timing penalty of 19%.