Power-aware resource allocation in high-end systems via online simulation

Authors:
Barry Lawson;Evgenia Smirni
Affiliations:
University of Richmond, Richmond, VA;College of William and Mary, Williamsburg, VA
Venue:
Proceedings of the 19th annual international conference on Supercomputing
Year:
2005

Abstract

Traditionally, scheduling in high-end parallel systems focuses on how to minimize the average job waiting time and on how to maximize the overall system utilization. Despite the development of scheduling strategies that aim at maximizing system utilization, parallel supercomputing traces that span long time periods indicate that such systems are mostly underutilized. Much of the time there is simply not enough load to keep the system fully utilized, although time periods do exist where system utilization levels peak at nearly 95%. In this paper, we propose a new family of scheduling policies that aims at minimizing power consumption and cooling costs by selectively choosing to power down (or put in "sleep" mode) parts of the system during periods of low load. Our goal is the development of a scheduling mechanism that adaptively adjusts the number of processors to the offered load while meeting predefined service-level agreements (SLAs). This scheduling mechanism uses online simulation, i.e., lightweight simulation modules that can execute while the system and its scheduler are in operation, and can guide resource provisioning in parallel systems. Detailed experimentation using traces from the Parallel Workloads Archive indicates that the proposed online mechanism is a viable alternative to conserve energy while meeting performance-based SLAs.
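
The idea of using a lightweight simulation to size the active processor pool can be illustrated with a minimal sketch. This is not the authors' policy: the abstract does not specify the scheduling discipline or the SLA metric, so the sketch assumes strict first-come-first-served scheduling and an SLA expressed as a bound on average job wait time. The names `simulate_fcfs`, `choose_processor_count`, and the `Job` tuple layout are hypothetical.

```python
import heapq
from typing import List, Tuple

# Hypothetical job record: (arrival_time, processors_needed, runtime)
Job = Tuple[float, int, float]


def simulate_fcfs(jobs: List[Job], processors: int) -> float:
    """Average wait time of `jobs` under strict FCFS on `processors` CPUs."""
    free = processors
    running: List[Tuple[float, int]] = []  # min-heap of (finish_time, processors_held)
    clock = 0.0
    total_wait = 0.0
    for arrival, need, runtime in sorted(jobs):
        if need > processors:
            return float("inf")  # this job can never start on so few processors
        clock = max(clock, arrival)
        # Reclaim processors from jobs that have already finished.
        while running and running[0][0] <= clock:
            free += heapq.heappop(running)[1]
        # Otherwise advance time until enough processors are free.
        while free < need:
            finish, held = heapq.heappop(running)
            clock = max(clock, finish)
            free += held
        total_wait += clock - arrival
        free -= need
        heapq.heappush(running, (clock + runtime, need))
    return total_wait / len(jobs) if jobs else 0.0


def choose_processor_count(jobs: List[Job], candidates: List[int],
                           sla_avg_wait: float) -> int:
    """Smallest candidate count whose simulated average wait meets the
    assumed SLA; falls back to the largest candidate if none does."""
    for count in sorted(candidates):
        if simulate_fcfs(jobs, count) <= sla_avg_wait:
            return count
    return max(candidates)


if __name__ == "__main__":
    queued = [(0.0, 4, 10.0), (1.0, 4, 10.0), (2.0, 2, 5.0)]
    print(choose_processor_count(queued, [4, 8, 10, 16], sla_avg_wait=3.0))
```

In this toy setting, the processors not selected by `choose_processor_count` would be the candidates for powering down or sleeping; a real system would rerun the simulation as the queue changes and would need to account for the cost of waking processors, which the sketch ignores.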