Making effective use of shared-memory multiprocessors: the process control approach

Authors:
Anoop Gupta;Andrew Tucker;Luis Stevens
Affiliations:
-;-;-
Venue:
Making effective use of shared-memory multiprocessors: the process control approach
Year:
1991

Citing 0
Cited 3

Scheduler activations: effective kernel support for the user-level management of parallelism
ACM Transactions on Computer Systems (TOCS)

A dynamic processor allocation policy for multiprogrammed shared-memory multiprocessors
ACM Transactions on Computer Systems (TOCS)

Scheduling and page migration for multiprocessor compute servers
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems

Abstract

We present the design, implementation, and performance of a novel approach for effectively utilizing shared-memory multiprocessors in the presence of multiprogramming. Our approach offers high performance by combining the techniques of process control and processor partitioning. The process control technique is based on the principle that, to maximize performance, a parallel application must dynamically match the number of runnable processes associated with it to the effective number of processors available to it. This avoids the problems arising from oblivious preemption of processes, and it allows an application to work at a better operating point on its speedup-versus-processors curve. Processor partitioning is necessary for dealing with realistic multiprogramming environments, where both process-controlled and non-controlled applications may be present. It also helps improve the cache performance of applications and removes the bottleneck associated with a single centralized scheduler. Preliminary results from an implementation of the process control approach, with a user-level server, were presented in a previous paper. In this paper, we extend the process control approach to work with processor partitioning and fully integrate the approach with the operating system kernel. This also allows us to address a limitation in our earlier implementation, wherein a close correspondence between runnable processes and the available processors was not maintained in the presence of I/O. The paper presents the design decisions and the rationale for the current implementation, along with extensive results from executions on a high-performance Silicon Graphics 4D/340