Traditional work-stealing schedulers perform poorly on multiprogrammed multi-core architectures because every program tends to use all the cores, causing serious core contention. To relieve this problem, this paper proposes a Demand-aware Work-Stealing (DWS) task scheduler, under which a work-stealing program uses cores according to its real-time demand for them. When multiple programs scheduled by DWS run concurrently on a multi-core architecture, the cores are first allocated evenly among the co-running programs. At runtime, a program that cannot fully utilize its allocated cores releases some of them; conversely, a program that demands more cores tries to use the free cores released by its co-running programs. Experimental results show that DWS achieves up to a 32.3% performance gain for co-running programs compared with traditional work-stealing schedulers that use the ABP yielding mechanism.
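The allocation policy described above can be sketched as follows. This is only a minimal illustration of the idea, not DWS's actual implementation; all names (`CoreAllocator`, `release`, `request`) are hypothetical.

```python
class CoreAllocator:
    """Illustrative sketch of demand-aware core allocation:
    cores start evenly partitioned; underloaded programs release
    cores to a free pool, and overloaded programs borrow from it."""

    def __init__(self, total_cores, program_ids):
        # Initially partition the cores evenly among co-running programs.
        share = total_cores // len(program_ids)
        self.alloc = {pid: share for pid in program_ids}
        self.free = total_cores - share * len(program_ids)

    def release(self, pid, n):
        """A program returns cores it cannot keep busy."""
        n = min(n, self.alloc[pid])
        self.alloc[pid] -= n
        self.free += n

    def request(self, pid, n):
        """A program with surplus demand takes cores released by others.
        Returns how many cores were actually granted."""
        granted = min(n, self.free)
        self.free -= granted
        self.alloc[pid] += granted
        return granted
```

For example, with 8 cores and two co-running programs A and B, each starts with 4 cores; if A releases 2 idle cores, B can acquire up to those 2 extra cores on demand.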