Large-scale applications typically contain parallel loops with many iterates. The iterates of a parallel loop may have variable execution times, and the resulting load imbalance degrades application performance. This paper describes a tool for load balancing parallel loops on distributed-memory systems. The tool assumes that the data for the parallel loop to be executed is already partitioned among the participating processors. It uses the MPI library for interprocessor coordination and determines processor workloads with loop scheduling techniques. The tool was designed to be independent of any application; hence, it must be supplied with a routine that encapsulates the computations for a chunk of loop iterates, as well as routines to transfer data and results between processors. Performance evaluation on a Linux cluster indicates that the tool reduces the cost of executing a simulated irregular loop by up to 81% relative to execution without load balancing. The tool is useful for parallelizing sequential applications that contain parallel loops, or as an alternative load balancing routine for existing parallel applications.
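To make the idea of determining workloads by a loop scheduling technique concrete, the sketch below computes chunk sizes with the basic factoring rule, in which each batch hands every processor a chunk of about 1/(2P) of the remaining iterates. This is an illustration only: the abstract does not say which scheduling techniques the tool implements, the sketch does not use MPI, and the names `factoring_chunks`, `run_chunks`, and `compute_chunk` are hypothetical and not part of the tool's interface.

```python
import math


def factoring_chunks(n_iterates, n_procs):
    """Chunk sizes under the basic factoring rule (assumed batch factor of 2).

    Each batch consists of up to n_procs equal chunks, each of size
    ceil(remaining / (2 * n_procs)), so chunks shrink as the loop drains.
    """
    chunks = []
    remaining = n_iterates
    while remaining > 0:
        size = max(1, math.ceil(remaining / (2 * n_procs)))
        for _ in range(n_procs):
            if remaining == 0:
                break
            take = min(size, remaining)
            chunks.append(take)
            remaining -= take
    return chunks


def run_chunks(chunks, compute_chunk):
    """Apply a user-supplied chunk routine to consecutive iterate ranges.

    compute_chunk(start, size) stands in for the application routine that
    encapsulates the computations for one chunk of loop iterates.
    """
    results = []
    start = 0
    for size in chunks:
        results.append(compute_chunk(start, size))
        start += size
    return results


if __name__ == "__main__":
    sizes = factoring_chunks(100, 4)
    print(sizes)
    # Hypothetical chunk routine: sum the iterate indices in the chunk.
    partial = run_chunks(sizes, lambda start, size: sum(range(start, start + size)))
    print(sum(partial))
```

Large early chunks keep scheduling overhead low, while the small late chunks let processors that finish early absorb leftover work. In a distributed-memory setting, a scheduler would hand these chunks to processors on request and invoke the data-transfer routines whenever a chunk's data resides on a different processor.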