Thread prioritization: a thread scheduling mechanism for multiple-context parallel processors

Authors:
S. Fiske;W. J. Dally
Affiliations:
-;-
Venue:
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Year:
1995

Citing 23
Cited 2

Programming for Parallelism

Computer
Distributing Hot-Spot Addressing in Large-Scale Multiprocessors

IEEE Transactions on Computers
Two algorithms for barrier synchronization

International Journal of Parallel Programming
Efficient synchronization primitives for large-scale cache-coherent multiprocessors

ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Exploring the benefits of multiple hardware contexts in a multiprocessor architecture: preliminary results

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Analysis of multithreaded architectures for parallel computing

SPAA '90 Proceedings of the second annual ACM symposium on Parallel algorithms and architectures
Comparative evaluation of latency reducing and tolerating techniques

ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Processor coupling: integrating compile time and runtime scheduling for parallelism

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Waiting algorithms for synchronization in large-scale multiprocessors

ACM Transactions on Computer Systems (TOCS)
Experience with fine-grain synchronization in MIMD machines for preconditioned conjugate gradient

PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Register relocation: flexible contexts for multithreading

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Interleaving: a multithreading technique targeting multiprocessors and workstations

ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
The effectiveness of multiple hardware contexts

ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Empirical studies of competitve spinning for a shared-memory multiprocessor

SOSP '91 Proceedings of the thirteenth ACM symposium on Operating systems principles
APRIL: a processor architecture for multiprocessing

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors

IEEE Transactions on Parallel and Distributed Systems
The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems

IEEE Transactions on Parallel and Distributed Systems
Performance Tradeoffs in Multithreaded Processors

IEEE Transactions on Parallel and Distributed Systems
Block Scheduling of Iterative Algorithms and Graph-Level Priority Scheduling in a Simulated Data-Flow Multiprocessor

IEEE Transactions on Parallel and Distributed Systems
CACHE COHERENCE PROTOCOLS FOR LARGE-SCALE MULTIPROCESSORS

CACHE COHERENCE PROTOCOLS FOR LARGE-SCALE MULTIPROCESSORS
PROTEUS: A HIGH-PERFORMANCE PARALLEL-ARCHITECTURE SIMULATOR

PROTEUS: A HIGH-PERFORMANCE PARALLEL-ARCHITECTURE SIMULATOR
LOW-COST SUPPORT FOR FINE-GRAIN SYNCHRONIZATION IN MULTIPROCESSORS

LOW-COST SUPPORT FOR FINE-GRAIN SYNCHRONIZATION IN MULTIPROCESSORS
Algorithms for Scalable Synchronization on Shared-Memory Multiproceessors

Algorithms for Scalable Synchronization on Shared-Memory Multiproceessors

Exploiting fine-grain thread level parallelism on the MIT multi-ALU processor

Proceedings of the 25th annual international symposium on Computer architecture
A Microserver View of HTMT

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium

Quantified Score

Hi-index	0.00

Visualization

Abstract

Multiple-context processors provide register resources that allow rapid context switching between several threads as a means of tolerating long communication and synchronization latencies. When scheduling threads on such a processor, we must first decide which threads should have their state loaded into the multiple contexts, and second, which loaded thread is to execute instructions at any given time. In this paper we show that both decisions are important, and that incorrect choices can lead to serious performance degradation. We propose thread prioritization as a means of guiding both levels of scheduling. Each thread has a priority that can change dynamically, and that the scheduler uses to allocate as many computation resources as possible to critical threads. We briefly describe its implementation, and we show simulation performance results for a number of simple benchmarks in which synchronization performance is critical.