We study strategies for redistributing the load in an adaptive finite element computation performed on a cluster of workstations. The cluster is assumed to be a heterogeneous, multi-user computing environment. The performance of a particular processor depends on both static factors, such as the processor hardware, and dynamic factors, such as the system load and the work of other users. On a network, all processors are assumed to be connected, but the topology of the finite element sub-domains can be interpreted as a processor topology, so a set of neighbours can be defined for each processor. In finite element analysis, the quantity of computation on a processor is proportional to the size of its sub-domain plus some contribution from the neighbours. We consider schemes that modify the sub-domains by, in general, moving data to adjacent processors. The numerical experiments show the efficiency of the approach.
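The abstract does not specify the redistribution schemes themselves, so the following is only a minimal sketch of one common neighbour-based approach: a first-order diffusion sweep on the processor graph induced by the sub-domains. All names (`diffusion_step`, `balance`, `speed`, `alpha`) and the three-processor example are illustrative assumptions, not taken from the paper. The sketch also simplifies the cost model by using sub-domain size alone as the load, ignoring the contribution from the neighbours.

```python
def diffusion_step(load, speed, neighbours, alpha=0.25):
    """One sweep of first-order diffusion on a processor graph.

    load[i]       -- work units (e.g. elements) held by processor i
    speed[i]      -- assumed relative speed of processor i
    neighbours[i] -- processors whose sub-domains are adjacent to i's
    alpha         -- assumed damping factor; must be small enough for stability
    """
    # Estimated completion time per processor: work divided by speed.
    time = [w / s for w, s in zip(load, speed)]
    new_load = list(load)
    for i, adjacent in neighbours.items():
        for j in adjacent:
            if i < j:  # visit each undirected edge once
                # Positive flow moves work from i to j, negative from j to i.
                flow = alpha * (time[i] - time[j])
                new_load[i] -= flow
                new_load[j] += flow
    return new_load


def balance(load, speed, neighbours, sweeps=200, alpha=0.25):
    """Repeat diffusion sweeps; total work is conserved by every sweep."""
    for _ in range(sweeps):
        load = diffusion_step(load, speed, neighbours, alpha)
    return load


# Hypothetical chain of three workstations; the middle one is twice as fast.
speed = [1.0, 2.0, 1.0]
neighbours = {0: [1], 1: [0, 2], 2: [1]}
balanced = balance([120.0, 0.0, 0.0], speed, neighbours)
```

In this example the loads settle at work proportional to speed, so every processor has the same estimated completion time. In a multi-user setting the `speed` values would have to be re-measured between sweeps, since the abstract notes that processor performance also depends on dynamic factors.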