Symbiotic job scheduling boosts simultaneous multithreading (SMT) processor performance by co-scheduling jobs that have 'compatible' demands on the processor's shared resources. Existing approaches, however, require a sampling phase, evaluate only a limited number of possible co-schedules, use heuristics to gauge symbiosis, are rigid in their optimization target, and do not preserve system-level priorities/shares. This paper proposes probabilistic job symbiosis modeling, which predicts whether jobs will create positive or negative symbiosis when co-scheduled, without requiring the co-schedule to be evaluated. The model, which uses per-thread cycle stacks computed through a previously proposed cycle accounting architecture, is simple enough to be used in system software. Probabilistic job symbiosis modeling provides six key innovations over prior work in symbiotic job scheduling: (i) it does not require a sampling phase, (ii) it readjusts the job co-schedule continuously, (iii) it evaluates a large number of possible co-schedules at very low overhead, (iv) it is not driven by heuristics, (v) it can optimize a performance target of interest (e.g., system throughput or job turnaround time), and (vi) it preserves system-level priorities/shares. These innovations make symbiotic job scheduling both practical and effective. Our experimental evaluation, which assumes a realistic scenario in which jobs come and go, reports an average 16% (and up to 35%) reduction in job turnaround time compared to the previously proposed SOS (sample, optimize, symbios) approach for a two-thread SMT processor, and an average 19% (and up to 45%) reduction in job turnaround time for a four-thread SMT processor.
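To make the idea of model-driven co-scheduling concrete, the following is a minimal sketch in Python of how per-job cycle stacks could be used to rank co-schedules for a two-thread SMT processor without running them. It is not the model proposed in this paper: the slowdown formula, the component names (`base`, `cache`, `memory`, `fetch`), the example stacks, and all function names are assumptions made purely for illustration. The toy model treats each cycle-stack fraction as the probability that a job occupies that shared resource in a given cycle, and charges a penalty when both jobs are likely to need the same resource at once.

```python
# Illustrative sketch only: the formula and all names below are hypothetical,
# not the model from the paper.

SHARED = ("cache", "memory", "fetch")  # assumed shared-resource components


def estimated_slowdown(own, other):
    """Toy estimate of a job's slowdown when co-run with another job.

    Each stack maps a component name to the fraction of cycles spent there.
    The product of two fractions approximates the probability that both jobs
    want the same resource in the same cycle.
    """
    stall_overlap = sum(own[c] * other[c] for c in SHARED)
    issue_overlap = own["base"] * other["base"]  # competing for issue slots
    return 1.0 + stall_overlap + issue_overlap


def pair_throughput(a, b):
    """Estimated throughput of a pair: sum of per-job normalized progress."""
    return 1.0 / estimated_slowdown(a, b) + 1.0 / estimated_slowdown(b, a)


def pairings(names):
    """Yield every way to split a list of names into unordered pairs."""
    if not names:
        yield []
        return
    first, rest = names[0], names[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def best_coschedule(stacks):
    """Pick the pairing with the highest estimated total throughput."""
    names = sorted(stacks)

    def score(pairing):
        return sum(pair_throughput(stacks[a], stacks[b]) for a, b in pairing)

    return max(pairings(names), key=score)


# Hypothetical cycle stacks: two compute-bound and two memory-bound jobs.
stacks = {
    "c1": {"base": 0.90, "cache": 0.05, "memory": 0.03, "fetch": 0.02},
    "c2": {"base": 0.90, "cache": 0.05, "memory": 0.03, "fetch": 0.02},
    "m1": {"base": 0.20, "cache": 0.10, "memory": 0.65, "fetch": 0.05},
    "m2": {"base": 0.20, "cache": 0.10, "memory": 0.65, "fetch": 0.05},
}
best = best_coschedule(stacks)
print(best)
```

Under these assumed stacks, the sketch prefers pairing each compute-bound job with a memory-bound job over pairing like with like, because the estimated resource overlap is smaller. Since every candidate is scored from the cycle stacks alone, no co-schedule has to be sampled on the hardware first. The objective inside `best_coschedule` could be swapped for another target of interest, such as estimated job turnaround time.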