The OpenMP programming model gives parallel applications a very important feature: job malleability, the capacity of an application to dynamically adapt its parallelism to the number of processors allocated to it. We believe that job malleability gives applications the flexibility that a system needs to achieve its maximum performance. We also argue that a system must base its decisions not only on user requirements but also on run-time performance measurements, to ensure the efficient use of resources. Job malleability is the application characteristic that makes run-time performance analysis possible: without it, applications would not be able to adapt their parallelism to the system's decisions. To support these ideas, we present two new approaches that address the two main problems of Gang Scheduling: the excessive number of time slots and fragmentation. Our first proposal, which we call Performance-Driven Gang Scheduling, applies a scheduling policy inside each Gang Scheduling time slot to distribute processors among applications according to their efficiency, calculated from run-time measurements. Our second proposal is a new re-packing algorithm, Compress&Join, that exploits job malleability: it modifies the processor allocation of running applications to adapt it to the system's needs and to minimize fragmentation and the number of time slots. Both proposals have been implemented on an SGI Origin 2000 with 64 processors. Results show the validity and convenience of both ideas: using job performance analysis computed at run time to decide the processor allocation, and using a flexible programming model that adapts applications to system decisions.
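The abstract does not spell out the allocation rule used inside a time slot, so the following is only an illustrative sketch of the general idea of efficiency-driven allocation, not the authors' policy. It assumes that efficiency is speedup divided by processor count, uses Amdahl's law as a stand-in for real run-time measurements, and grows each job's allocation one processor at a time while its efficiency stays above a threshold. The names `Job`, `measured_efficiency`, `allocate_slot`, and the 0.7 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    requested: int          # processors requested by the user
    serial_fraction: float  # assumption: stand-in for run-time measurements


def measured_efficiency(job: Job, procs: int) -> float:
    """Efficiency = speedup / procs, with speedup modeled by Amdahl's law.

    A real system would measure this at run time; the model is an assumption.
    """
    speedup = 1.0 / (job.serial_fraction + (1.0 - job.serial_fraction) / procs)
    return speedup / procs


def allocate_slot(jobs: list[Job], total_procs: int,
                  target_eff: float = 0.7) -> dict[str, int]:
    """Distribute the processors of one time slot among malleable jobs.

    Every job starts with one processor. Processors are then handed out
    round-robin, but a job only grows if it stays within its request and
    its efficiency at the larger size is still at least target_eff.
    """
    if len(jobs) > total_procs:
        raise ValueError("more jobs than processors in this slot")
    alloc = {job.name: 1 for job in jobs}
    free = total_procs - len(jobs)
    progress = True
    while free > 0 and progress:
        progress = False
        for job in jobs:
            if free == 0:
                break
            bigger = alloc[job.name] + 1
            if (bigger <= job.requested
                    and measured_efficiency(job, bigger) >= target_eff):
                alloc[job.name] = bigger
                free -= 1
                progress = True
    return alloc


if __name__ == "__main__":
    slot_jobs = [Job("scalable", 8, 0.0), Job("mostly_serial", 8, 0.5)]
    print(allocate_slot(slot_jobs, total_procs=8))
```

In this sketch the job that scales well absorbs the processors that the poorly scaling job cannot use efficiently, even though both requested the same number. Processors left over when no job can grow efficiently stay free, which is where a re-packing step could reuse them.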