Dynamic scheduling algorithms for parallel loops have been proposed to improve the performance of scientific applications in parallel and distributed environments. Such algorithms address performance degradation due to load imbalance, which is caused by predictable phenomena such as nonuniform data distribution or algorithmic variance, and by unpredictable phenomena such as data access latency or operating system interference. In particular, factoring, weighted factoring, adaptive weighted factoring, and adaptive factoring have been developed from a probabilistic analysis of parallel loop iterates with variable running times. These algorithms execute the iterates in variable-size chunks, with the sizes chosen so that the chunks complete before the optimal time with high probability. They have been implemented successfully in a number of scientific applications, including N-body and Monte Carlo simulations, CFD, and radar signal processing. This paper presents a comparative study of the performance of various loop scheduling algorithms in a message-passing environment. The algorithms have been integrated into a tool for executing parallel loops, and the tool has been applied to profile quadrature routines that are often used in scientific computations such as finite element methods, particle physics, and multivariate statistics. Experimental results show that, on loops with irregular iterate execution times, the most recently developed scheduling algorithms are more effective and more robust than the earlier ones.
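
To make the idea of variable-size chunks concrete, here is a minimal Python sketch of how a factoring-style scheduler might compute chunk sizes. It assumes the commonly used practical rule in which each batch assigns half of the remaining iterates, split evenly across the processors. The abstract does not state this rule, so the function name `factoring_chunks`, its parameters, and the default `factor=2.0` are illustrative assumptions, not the tool described in the paper.

```python
import math


def factoring_chunks(n_iters, n_procs, factor=2.0):
    """Return chunk sizes in the order a scheduler would hand them out.

    Assumption (not taken from the abstract): each batch consists of
    n_procs equal chunks of size ceil(remaining / (factor * n_procs)),
    so with factor=2 a batch covers roughly half of the remaining work.
    """
    if n_iters < 0 or n_procs < 1 or factor <= 0:
        raise ValueError("need n_iters >= 0, n_procs >= 1, factor > 0")

    chunks = []
    remaining = n_iters
    while remaining > 0:
        # Chunk size for this batch, never smaller than one iterate.
        size = max(1, math.ceil(remaining / (factor * n_procs)))
        for _ in range(n_procs):
            if remaining == 0:
                break
            take = min(size, remaining)  # the last chunk may be short
            chunks.append(take)
            remaining -= take
    return chunks


if __name__ == "__main__":
    # 100 iterates on 4 processors: large chunks first, small chunks last.
    print(factoring_chunks(100, 4))
```

Early chunks are large, which keeps scheduling overhead low, and later chunks shrink so that processors that fall behind can still finish close together. The weighted and adaptive variants named above would additionally scale the chunk sizes by processor speed or by measured iterate times, which this sketch does not attempt.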