Traditionally, scheduling in high-end parallel systems focuses on minimizing the average job waiting time and maximizing overall system utilization. Despite the development of scheduling strategies that aim at maximizing system utilization, parallel supercomputing traces that span long time periods indicate that such systems are mostly underutilized. Much of the time there is simply not enough load to keep the system fully utilized, although there are periods in which utilization peaks at nearly 95%. In this paper, we propose a new family of scheduling policies that aims at minimizing power consumption and cooling costs by selectively powering down (or putting in "sleep" mode) parts of the system during periods of low load. Our goal is the development of a scheduling mechanism that adaptively adjusts the number of processors to the offered load while meeting predefined service-level agreements (SLAs). This scheduling mechanism uses online simulation, i.e., lightweight simulation modules that can execute while the system and its scheduler are in operation and that can guide resource provisioning in parallel systems. Detailed experimentation using traces from the Parallel Workloads Archive indicates that the proposed online mechanism is a viable way to conserve energy while meeting performance-based SLAs.
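The idea of using online simulation to size the active part of the machine can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes plain FCFS scheduling without backfilling, takes the average waiting time as the only SLA metric, and replays a short window of recently observed jobs. All names here (`Job`, `simulate_fcfs_avg_wait`, `min_procs_meeting_sla`, `max_avg_wait`) are hypothetical.

```python
import heapq
from typing import List, Tuple

# Hypothetical job record: (arrival_time, processors_needed, runtime)
Job = Tuple[float, int, float]


def simulate_fcfs_avg_wait(jobs: List[Job], active_procs: int) -> float:
    """Replay `jobs` under plain FCFS on `active_procs` processors
    and return the average waiting time (assumed SLA metric)."""
    running: List[Tuple[float, int]] = []  # min-heap of (finish_time, procs)
    free = active_procs
    clock = 0.0
    total_wait = 0.0
    for arrival, procs, runtime in sorted(jobs):
        if procs > active_procs:
            return float("inf")  # job can never run on this configuration
        clock = max(clock, arrival)
        # Wait for running jobs to finish until enough processors are free.
        while free < procs:
            finish, released = heapq.heappop(running)
            clock = max(clock, finish)
            free += released
        total_wait += clock - arrival
        free -= procs
        heapq.heappush(running, (clock + runtime, procs))
    return total_wait / len(jobs) if jobs else 0.0


def min_procs_meeting_sla(jobs: List[Job], total_procs: int,
                          max_avg_wait: float, step: int = 1) -> int:
    """Smallest simulated processor count whose average wait meets the
    SLA; falls back to the full machine if no smaller count does."""
    for p in range(step, total_procs + 1, step):
        if simulate_fcfs_avg_wait(jobs, p) <= max_avg_wait:
            return p
    return total_procs


# Usage: three recent jobs on an 8-processor machine, SLA of 2 time units.
recent = [(0.0, 2, 10.0), (0.0, 2, 10.0), (5.0, 4, 5.0)]
print(min_procs_meeting_sla(recent, total_procs=8, max_avg_wait=2.0))  # 4
```

In this toy run, four processors already meet the assumed SLA, so the remaining four could be powered down or put to sleep until the load rises. A realistic version would also have to account for the time and energy cost of waking processors and for the scheduler's actual policy.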