Service level agreement for multithreaded processors

Authors:
Ron Gabor; Avi Mendelson; Shlomo Weiss
Affiliations:
Tel Aviv University and Intel Corporation; Microsoft Corporation; Tel Aviv University
Venue:
ACM Transactions on Architecture and Code Optimization (TACO)
Year:
2009




Abstract

Multithreading is widely used to increase processor throughput. As the number of shared resources increases, managing them while guaranteeing predicted performance becomes a major problem. Previous work has attempted to ease this problem through various fairness mechanisms. In this article, we present a new approach to controlling resource allocation and sharing via a service level agreement (SLA)-based mechanism; that is, via an agreement in which multithreaded processors guarantee a minimal level of service to the running threads. We introduce a new metric, CSLA, for conformance to SLA in multithreaded processors and show that controlling resources using SLA allows for higher gains than are achievable by previously suggested fairness techniques. It also permits improving one metric (e.g., power) while maintaining SLA in another (e.g., performance). We compare SLA enforcement to schemes based on other fairness metrics, which are mostly targeted at equalizing execution parameters. We show that using SLA rather than fairness-based algorithms provides a range of acceptable execution points from which we can select the point that best fits our optimization target, such as maximizing the weighted speedup (sum of the speedups of the individual threads) or reducing power. We demonstrate the effectiveness of the new SLA approach using switch-on-event (coarse-grained) multithreading. Our weighted speedup improvement scheme successfully enforces SLA while improving the weighted speedup by an average of 10% for unbalanced threads. This result is significant when compared with the performance losses that may be incurred by fairness enforcement methods. When optimizing for power reduction in unbalanced threads, SLA enforcement reduces the power by an average of 15%. SLA may be complemented by other power reduction methods to achieve further power savings while maintaining the same service level for the threads. We also demonstrate differentiated SLA, where weighted speedup is maximized while each thread may have a different throughput constraint.
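
The selection step described in the abstract (choosing, among the execution points that satisfy the SLA, the one that best fits an optimization target) can be illustrated with a small sketch. This is not the article's mechanism and does not compute its CSLA metric, whose definition is not given here. It assumes that each candidate execution point is summarized by one speedup value per thread (commonly a thread's throughput when sharing the processor relative to running alone) and that the SLA is a per-thread minimum speedup. The function names and all numbers are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]  # one speedup value per thread (assumed representation)


def weighted_speedup(speedups: Sequence[float]) -> float:
    """Sum of the per-thread speedups, as the abstract defines weighted speedup."""
    return sum(speedups)


def meets_sla(speedups: Sequence[float], min_speedups: Sequence[float]) -> bool:
    """True if every thread reaches its guaranteed minimum (assumed SLA form)."""
    return all(s >= m for s, m in zip(speedups, min_speedups))


def best_point(points: Sequence[Point],
               min_speedups: Sequence[float]) -> Optional[Point]:
    """Among SLA-conforming points, pick the one with the highest weighted speedup."""
    acceptable = [p for p in points if meets_sla(p, min_speedups)]
    if not acceptable:
        return None  # no execution point satisfies the SLA
    return max(acceptable, key=weighted_speedup)


# Hypothetical execution points for two threads (illustrative numbers only).
candidate_points = [
    (0.50, 0.50),  # equalized speedups, as a fairness scheme might target
    (0.70, 0.45),
    (0.95, 0.30),
]

# Uniform SLA: both threads are guaranteed the same minimum.
uniform_choice = best_point(candidate_points, (0.40, 0.40))

# Differentiated SLA: each thread has its own constraint.
differentiated_choice = best_point(candidate_points, (0.60, 0.30))

print(uniform_choice, differentiated_choice)
```

In this toy data the equalized point is acceptable under the uniform SLA but is not the one with the highest weighted speedup, which is the kind of headroom the abstract attributes to enforcing an SLA instead of equalizing execution parameters. Replacing `weighted_speedup` with a different objective (for example, an estimated power cost to minimize) would sketch the power-oriented variant under the same assumptions.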