Existing techniques for sharing processing resources in multiprogrammed shared-memory multiprocessors, such as time-sharing, space-sharing, and gang-scheduling, typically sacrifice the performance of individual parallel applications to improve overall system utilization. We present a new processor allocation technique called Loop-Level Process Control (LLPC) that dynamically adjusts the number of processors an application is allowed to use for the execution of each parallel section of code, based on the current system load. This approach exploits the maximum parallelism available in each application without overloading the system. We implement our scheme on a Silicon Graphics Challenge multiprocessor system and evaluate its performance using applications from the Perfect Club benchmark suite and synthetic benchmarks. Our approach shows significant improvements over traditional time-sharing and gang-scheduling. Its performance is comparable to, or slightly better than, that of static space-sharing, but our strategy is more robust because, unlike static space-sharing, it does not require a priori knowledge of each application's parallelism characteristics.
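The core mechanism, re-deciding the processor count at the entry of each parallel loop, can be sketched in modern terms. The following C/OpenMP fragment is an illustration only, not the paper's SGI Challenge implementation: it assumes the 1-minute load average is an adequate proxy for current system load, and the helper llpc_processors() is a hypothetical name introduced here.

```c
/* Minimal sketch of loop-level processor allocation in the spirit of
 * LLPC, using OpenMP and getloadavg() as stand-ins for the original
 * SGI Challenge primitives (a hypothetical mapping, not the paper's code). */
#include <omp.h>
#include <stdlib.h>
#include <unistd.h>

/* Decide how many processors this parallel section may use:
 * online processors minus the current load, clamped to [1, max]. */
static int llpc_processors(int max_parallelism)
{
    long nprocs = sysconf(_SC_NPROCESSORS_ONLN);
    double load = 0.0;
    if (getloadavg(&load, 1) < 1)   /* 1-minute load average */
        load = 0.0;                  /* fall back: assume idle */
    int avail = (int)(nprocs - load);
    if (avail < 1) avail = 1;
    if (avail > max_parallelism) avail = max_parallelism;
    return avail;
}

void scale_vector(double *x, double a, int n)
{
    /* Re-evaluate the allocation at the entry of each parallel loop,
     * so a loaded system shrinks this job's processor share. */
    omp_set_num_threads(llpc_processors(n));
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        x[i] *= a;
}
```

Because the decision is made per parallel section rather than once at job launch, a lightly loaded system lets the application run fully parallel, while a heavily loaded one throttles it, which is what distinguishes this style of allocation from static space-sharing.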