This work focuses on processor allocation in shared-memory multiprocessor systems where no knowledge of an application is available when it is submitted. Processors are allocated according to application characteristics measured at run time. We aim to demonstrate the importance of both an accurate performance analysis and the criteria used to distribute processors. To this end, we present the SelfAnalyzer, an approach for dynamically analyzing application performance (speedup, efficiency, and execution time), and Performance-Driven Processor Allocation (PDPA), a new scheduling policy that distributes processors considering both the global conditions of the system and the particular characteristics of the running applications. This work also argues for the importance of interaction between the medium-term and long-term schedulers to control the multiprogramming level in the case of clairvoyant scheduling policies. We have implemented our proposal on an SGI Origin2000 with 64 processors and compared its performance with that of several previously proposed scheduling policies and with the native IRIX scheduling policy. Results show that the combination of SelfAnalyzer+PDPA with the medium/long-term scheduling interaction outperforms the other scheduling policies evaluated. In workloads where a simple equipartition performs well, PDPA also performs well; in extreme workloads where all the applications perform poorly, our proposal can achieve a speedup of 3.9 with respect to equipartition and 11.8 with respect to the native IRIX scheduling policy.
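For readers unfamiliar with the metrics named above, the following is a minimal sketch of the textbook definitions of speedup and efficiency, together with a plain equipartition of processors among jobs. It is not the SelfAnalyzer or the PDPA policy, whose internals are not described here; the function names and the choice of a one-processor baseline are assumptions made purely for illustration.

```python
def speedup(time_baseline, time_parallel):
    """Textbook speedup: baseline execution time over parallel execution time.

    Assumption: the baseline is a run on a reference processor count
    (commonly one processor); the abstract does not specify this.
    """
    if time_baseline <= 0 or time_parallel <= 0:
        raise ValueError("execution times must be positive")
    return time_baseline / time_parallel


def efficiency(time_baseline, time_parallel, processors):
    """Textbook efficiency: speedup divided by the number of processors used."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return speedup(time_baseline, time_parallel) / processors


def equipartition(total_processors, jobs):
    """Split processors as evenly as possible among jobs (hypothetical helper).

    Any remainder is handed out one processor at a time to the first jobs.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    base, extra = divmod(total_processors, jobs)
    return [base + (1 if i < extra else 0) for i in range(jobs)]


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the paper.
    print(speedup(100.0, 25.0))        # baseline 100 s, parallel 25 s
    print(efficiency(100.0, 25.0, 8))  # same run on 8 processors
    print(equipartition(64, 5))        # 64 processors shared by 5 jobs
```

An allocation policy driven by measured performance would, in contrast to equipartition, be expected to give fewer processors to applications whose measured efficiency is low; how PDPA makes that decision is not stated in this abstract.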