Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers. IEEE Transactions on Computers.
Factoring: A Method for Scheduling Parallel Loops. Communications of the ACM.
Trapezoid Self-Scheduling: A Practical Scheduling Scheme for Parallel Compilers. IEEE Transactions on Parallel and Distributed Systems.
Scalable Loop Self-Scheduling Schemes for Heterogeneous Clusters. CLUSTER '02: Proceedings of the IEEE International Conference on Cluster Computing.
Overhead Analysis of a Dynamic Load Balancing Library for Cluster Computing. IPDPS '05: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium, Workshop 1, Volume 02.
An Enhanced Parallel Loop Self-Scheduling Scheme for Cluster Environments. The Journal of Supercomputing.
Distributed Loop-Scheduling Schemes for Heterogeneous Computer Systems. Concurrency and Computation: Practice & Experience.
Development of Mixed-Mode MPI/OpenMP Applications. Scientific Programming.
Locality and Loop Scheduling on NUMA Multiprocessors. ICPP '93: Proceedings of the 1993 International Conference on Parallel Processing, Volume 02.
A Performance-Based Parallel Loop Scheduling on Grid Environments. The Journal of Supercomputing.
Dynamic Partitioning of Loop Iterations on Heterogeneous PC Clusters. The Journal of Supercomputing.
Parallel Loop Self-Scheduling for Heterogeneous Cluster Systems with Multi-core Computers. APSCC '08: Proceedings of the 2008 IEEE Asia-Pacific Services Computing Conference.
ICPPW '09: Proceedings of the 2009 International Conference on Parallel Processing Workshops.
Loosely-Coupled Loop Scheduling in Computational Grids. IPDPS '06: Proceedings of the 20th International Conference on Parallel and Distributed Processing.
Concurrency and Computation: Practice & Experience.
Previously, we proposed Layered Self-Scheduling (LSS), a hybrid MPI/OpenMP loop self-scheduling approach that addresses the heterogeneity problem on cluster systems composed of multi-core compute nodes, in which the allocation functions of several well-known schemes were modified for better performance. Although LSS outperforms conventional self-scheduling schemes, our comprehensive experiments and analyses showed that its performance can be improved further. The newly proposed task scheduling strategy, Enhanced Layered Self-Scheduling (ELSS), aims to utilize the processor cores of the master compute node more efficiently and to schedule tasks so that performance improvements are more stable. We evaluated the new strategy with three benchmark applications: Matrix Multiplication, Monte Carlo Integration, and Mandelbrot Set Computation. We recommend that the global scheduler adopt Guided Self-Scheduling (GSS) in all cases, and that the local scheduler adopt the static scheme for applications with regular workload distributions; for applications with irregular workload distributions, any local scheme may be used. Experimental results show that the best speedups obtained by ELSS over LSS for the three benchmark programs are 1.373, 13.34, and 2.4, respectively.
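The layered approach described in the abstract pairs a global scheduler (e.g., GSS handing out chunks of loop iterations across compute nodes) with a local scheduler (e.g., a static split of each received chunk across a node's cores). A minimal sketch of those two chunking rules follows; the function names `gss_chunks` and `static_split` are illustrative and not taken from the paper, and the sketch omits the MPI/OpenMP machinery that an actual implementation would use.

```python
import math

def gss_chunks(total_iters, num_workers):
    """Global level: Guided Self-Scheduling (GSS) gives each
    requesting worker ceil(remaining / P) iterations, so chunks
    shrink as the loop nears completion."""
    remaining = total_iters
    while remaining > 0:
        chunk = math.ceil(remaining / num_workers)
        yield chunk
        remaining -= chunk

def static_split(chunk, num_cores):
    """Local level: a static scheme divides a received chunk as
    evenly as possible among the node's cores."""
    base, extra = divmod(chunk, num_cores)
    return [base + (1 if i < extra else 0) for i in range(num_cores)]
```

For a 100-iteration loop scheduled globally over 4 nodes, the first GSS chunk is 25 iterations; a node with 2 cores would then split that chunk statically as [13, 12]. Shrinking global chunks balance load across heterogeneous nodes, while the even local split suits the regular-workload case the abstract recommends for the static local scheduler.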