PAM: a novel performance/power aware meta-scheduler for multi-core systems

Authors:
Mohammad Banikazemi; Dan Poff; Bulent Abali
Affiliations:
IBM Thomas J. Watson Research Center, Hawthorne, NY; IBM Thomas J. Watson Research Center, Hawthorne, NY; IBM Thomas J. Watson Research Center, Hawthorne, NY
Venue:
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Year:
2008

Abstract

Sharing resources such as caches and main memory bandwidth among the cores of a multi-core system requires a scheduling scheme more sophisticated than the standard system scheduler. PAM is a low-overhead, user-level meta-scheduler that requires no hardware or software changes. It detects resource congestion and guides the standard system scheduler by restricting the assignment of processes to subsets of the available cores. PAM includes a cache model that it uses to predict the impact of new schedules. It can be used to improve the system along three dimensions: performance, power, and energy consumption, or any combination of the three. On our prototype, individual benchmarks improve by up to 33% and overall system performance improves by as much as 14%.
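The general mechanism of steering a standard scheduler by restricting processes to subsets of cores can be illustrated with a small sketch. This is not PAM's algorithm or its cache model, neither of which is described in the abstract. It is a minimal greedy placement under stated assumptions: `miss_rates` is a hypothetical per-process congestion metric, `cache_domains` is a hypothetical list of core groups that share a cache, and the function names are invented for illustration. The only real system call used is `os.sched_setaffinity`, which is available on Linux.

```python
import os
from typing import Dict, List, Set


def plan_affinity(miss_rates: Dict[int, float],
                  cache_domains: List[Set[int]]) -> Dict[int, Set[int]]:
    """Greedily spread cache-hungry processes across cache domains.

    miss_rates: hypothetical congestion metric per process (pid -> misses/s).
    cache_domains: hypothetical groups of core ids that share a cache.
    Returns a mapping pid -> set of cores the process may run on.
    """
    load = [0.0] * len(cache_domains)  # accumulated pressure per domain
    plan: Dict[int, Set[int]] = {}
    # Place the most demanding processes first so they end up apart.
    for pid in sorted(miss_rates, key=lambda p: miss_rates[p], reverse=True):
        # Pick the domain with the least accumulated pressure so far.
        target = min(range(len(cache_domains)), key=lambda d: load[d])
        load[target] += miss_rates[pid]
        plan[pid] = set(cache_domains[target])
    return plan


def apply_plan(plan: Dict[int, Set[int]]) -> None:
    """Hand the plan to the kernel scheduler as CPU affinity masks (Linux only)."""
    for pid, cores in plan.items():
        os.sched_setaffinity(pid, cores)


# Illustrative input: four made-up pids on a machine with two shared caches.
example_plan = plan_affinity(
    {101: 9.0, 102: 8.0, 103: 1.0, 104: 0.5},
    [{0, 1}, {2, 3}],
)
```

In this example the two most demanding processes (101 and 102) land in different cache domains, and the lighter ones fill in behind them. The kernel scheduler remains free to place each process on any core within its allowed set, which matches the user-level, advisory role the abstract describes.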