SIGMETRICS '91 Proceedings of the 1991 ACM SIGMETRICS conference on Measurement and modeling of computer systems
The performance of work stealing in multiprogrammed environments (extended abstract)
SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Data-parallel load balancing stategies
Parallel Computing
Load Balancing in Parallel Computers: Theory and Practice
Load Balancing in Parallel Computers: Theory and Practice
Strategies for Dynamic Load Balancing on Highly Parallel Computers
IEEE Transactions on Parallel and Distributed Systems
Developing SPMD applications with load balancing
Parallel Computing
Improving the Scalability of Parallel Jobs by adding Parallel Awareness to the Operating System
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
System noise, OS clock ticks, and fine-grained parallel applications
Proceedings of the 19th annual international conference on Supercomputing
ULE: a modern scheduler for FreeBSD
BSDC'03 Proceedings of the BSD Conference 2003 on BSD Conference
The Convergence of Realistic Distributed Load-Balancing Algorithms
Theory of Computing Systems
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
A dynamic scheduler for balancing HPC applications
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Scalable Dynamic Load Balancing Using UPC
ICPP '08 Proceedings of the 2008 37th International Conference on Parallel Processing
Efficient and scalable multiprocessor fair scheduling using distributed weighted round-robin
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Optimizing collective communication on multicores
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
Characterizing the impact of using spare-cores on application performance
EuroPar'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I
Juggle: proactive load balancing on multicore computers
Proceedings of the 20th international symposium on High performance distributed computing
Hi-index | 0.00 |
We investigate proactive dynamic load balancing on multicore systems, in which threads are continually migrated to reduce the impact of processor/thread mismatches. Our goal is to enhance the flexibility of the SPMD-style programming model and enable SPMD applications to run efficiently in multiprogrammed environments. We present Juggle, a practical decentralized, user-space implementation of a proactive load balancer that emphasizes portability and usability. In this paper we assume perfect intrinsic load balance and focus on extrinsic imbalances caused by OS noise, multiprogramming and mismatches of threads to hardware parallelism. Juggle shows performance improvements of up to 80 % over static load balancing for oversubscribed UPC, OpenMP, and pthreads benchmarks. We also show that Juggle is effective in unpredictable, multiprogrammed environments, with up to a 50 % performance improvement over the Linux load balancer and a 25 % reduction in performance variation. We analyze the impact of Juggle on parallel applications and derive lower bounds and approximations for thread completion times. We show that results from Juggle closely match theoretical predictions across a variety of architectures, including NUMA and hyper-threaded systems.