Predictable Performance in SMT Processors: Synergy between the OS and SMTs

Authors:
Francisco J. Cazorla;Peter M. W. Knijnenburg;Rizos Sakellariou;Enrique Fernandez;Alex Ramirez;Mateo Valero
Affiliations:
-;-;-;-;-;IEEE
Venue:
IEEE Transactions on Computers
Year:
2006

Abstract

Current Operating Systems (OS) perceive the different contexts of Simultaneous Multithreaded (SMT) processors as multiple independent processing units, although, in reality, threads executed in these units compete for the same hardware resources. Furthermore, hardware resources are assigned to threads implicitly, as determined by the SMT instruction fetch (Ifetch) policy, without any control by the OS. Both factors cause a lack of control over how individual threads are executed, which can frustrate the work of the job scheduler. This presents a problem for general-purpose systems, where the OS job scheduler cannot enforce priorities, and also for embedded systems, where it is difficult to guarantee worst-case execution times. In this paper, we propose a novel strategy that enables a two-way interaction between the OS and the SMT processor and allows the OS to run jobs at a certain percentage of their maximum speed, regardless of the workload in which these jobs are executed. In contrast to previous approaches, our approach enables the OS to run time-critical jobs without dedicating all internal resources to them, so that non-time-critical jobs can make significant progress as well, without significantly compromising overall throughput. In fact, our mechanism, in addition to fulfilling OS requirements, achieves 90 percent of the throughput of one of the best currently known fetch policies for SMTs.
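
The idea of running a job at "a certain percentage of its maximum speed" can be made concrete with a small sketch. The abstract does not describe the actual mechanism, so everything below is an assumption made for illustration: speed is measured as instructions per cycle (IPC), the maximum speed is the IPC the job reaches when it runs alone, and a hypothetical controller nudges a single "resource share" value up or down. The names `target_ipc` and `adjust_share` and the step size are invented here and do not come from the paper.

```python
def target_ipc(ipc_alone: float, percentage: float) -> float:
    """IPC a job must sustain to run at `percentage` of its stand-alone speed.

    Assumption: "maximum speed" is the IPC measured when the job runs alone.
    """
    if not 0.0 < percentage <= 100.0:
        raise ValueError("percentage must be in (0, 100]")
    return ipc_alone * percentage / 100.0


def adjust_share(share: float, measured_ipc: float, goal_ipc: float,
                 step: float = 0.05) -> float:
    """One step of a hypothetical feedback loop (not the paper's mechanism).

    Give the time-critical job a larger share of the shared resources when it
    is behind its goal, and hand resources back to the other threads when it
    is ahead. The result is clamped to the range [0.0, 1.0].
    """
    if measured_ipc < goal_ipc:
        share += step
    elif measured_ipc > goal_ipc:
        share -= step
    return min(1.0, max(0.0, share))


if __name__ == "__main__":
    # Illustrative numbers only: a job that reaches IPC 2.0 alone is asked
    # to run at 80 percent of that speed while sharing the processor.
    goal = target_ipc(ipc_alone=2.0, percentage=80.0)
    share = adjust_share(share=0.5, measured_ipc=1.2, goal_ipc=goal)
    print(f"goal IPC: {goal:.2f}, next resource share: {share:.2f}")
```

The point of the sketch is the contrast drawn in the abstract: the time-critical job receives only as many resources as it needs to meet its target, so whatever remains stays available to the non-time-critical threads.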