The Penn State RS/6000 SP is a uniquely acquired and operated computing facility. This 143-CPU machine, centrally located and jointly owned, is the result of collaboration between academic departments, research groups, and the central academic computing facility. It is the largest on-campus resource at Penn State for high performance computing.

Because of the machine's joint ownership structure, its job scheduling requirements differ significantly from the usual methods of processor allocation on distributed-memory parallel machines. After several years of adapting different queuing systems, primarily the Distributed Queuing System, to our needs, it became obvious that conventional scheduling systems did not serve the scheduling requirements unique to the Penn State SP. We concluded that a robust and easily configurable system needed to be developed. We drew inspiration from EASY and modeled our system on it. As with EASY, we use the LoadLeveler application programming interface to implement our scheduler, which is named the Penn State Condominium Scheduler (PSCS). PSCS implements the scheduling policy, and LoadLeveler executes the jobs on the machine.

PSCS is written to facilitate easier configuration and administration. It has no processor architecture dependence; in this regard it is similar to the native scheduler in LoadLeveler. PSCS incorporates three unique features: (i) node owner affinity, which ensures fairness by allocating nodes based on ownership; (ii) backfilling, which ensures efficient utilization of resources; and (iii) affinity for services provided, which ensures proper matching of jobs to processors based on memory, software, and other requirements. Jobs from users who own nodes in the SP complex have affinity to the processors they own. These users are also granted preferences according to their ownership level. Once the demand from the node owners is met, the next important goal is to keep the machine as fully occupied with running jobs as possible. This is accomplished by backfilling. The scheduler thus incorporates the features most important to the successful implementation of multi-owner, centrally located, heterogeneous computing facilities.
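
The interaction of the three features can be illustrated with a small sketch. This is not the PSCS implementation and does not use the LoadLeveler API; the `Node` and `Job` classes, their fields, and the `schedule` function are hypothetical names chosen for illustration. It assumes that "ownership level" can be approximated by the number of nodes a group owns, and it replaces backfilling with a simpler greedy fill pass: a job that does not fit is skipped so that later jobs can use idle nodes, without the start-time guarantees that a full backfilling scheduler would compute from run-time estimates.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    owner: str                       # owning group (hypothetical field)
    memory_mb: int
    software: frozenset = frozenset()
    busy: bool = False


@dataclass
class Job:
    name: str
    group: str                       # group submitting the job
    nodes_needed: int
    min_memory_mb: int = 0
    software: frozenset = frozenset()


def eligible(node, job):
    """Service affinity: the node is idle and meets memory and software needs."""
    return (not node.busy
            and node.memory_mb >= job.min_memory_mb
            and job.software <= node.software)


def pick_nodes(nodes, job):
    """Owner affinity: prefer nodes owned by the job's group, then any others."""
    candidates = [n for n in nodes if eligible(n, job)]
    # False sorts before True, so nodes owned by the job's group come first.
    candidates.sort(key=lambda n: n.owner != job.group)
    if len(candidates) < job.nodes_needed:
        return None
    return candidates[:job.nodes_needed]


def schedule(nodes, queue):
    """One scheduling pass; returns (started, waiting).

    Jobs from groups that own more nodes are considered first (an assumed
    reading of "ownership level"). Jobs that do not fit are skipped, and
    later jobs fill whatever nodes remain idle.
    """
    ownership = Counter(n.owner for n in nodes)
    ordered = sorted(queue, key=lambda j: -ownership[j.group])  # stable sort
    started, waiting = {}, []
    for job in ordered:
        chosen = pick_nodes(nodes, job)
        if chosen is None:
            waiting.append(job)
            continue
        for n in chosen:
            n.busy = True
        started[job.name] = [n.name for n in chosen]
    return started, waiting
```

In this sketch a job from a node-owning group lands on its own nodes whenever they are idle and suitable, while jobs from groups that own nothing are placed only on nodes left over after the owners' jobs have been considered.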