IEEE Transactions on Parallel and Distributed Systems
In this paper, we study the problem of scheduling parallel loops at compile time for a heterogeneous network of machines. We consider heterogeneity in three aspects of parallel programming: program, processor, and network. A heterogeneous program has parallel loops whose iterations perform different amounts of work; heterogeneous processors run at different speeds; and a heterogeneous network has different communication costs between different pairs of processors. We propose a simple yet comprehensive model for use in compiling for a network of processors, and we develop compiler algorithms that generate optimal and sub-optimal loop schedules, addressing load balancing, communication optimizations, and network contention. Experiments show that our techniques significantly improve performance.
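To make the load-balancing idea concrete, here is a minimal sketch of a static partition of loop iterations across processors of different speeds. It is not the paper's algorithm: the function names `partition_loop` and `finish_times`, the per-iteration `work` estimates, and the relative `speeds` are all assumptions made for illustration, and communication cost and network contention are ignored entirely.

```python
from itertools import accumulate


def partition_loop(work, speeds):
    """Split iterations 0..n-1 into contiguous blocks, one per processor.

    work[i]   : estimated cost of iteration i (hypothetical units)
    speeds[p] : relative speed of processor p (hypothetical units)
    Each block's total work is made roughly proportional to the speed of
    the processor that receives it. Communication is not modeled.
    """
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be non-empty and positive")
    total_work = sum(work)
    total_speed = sum(speeds)
    prefix = list(accumulate(work))  # prefix[i] = work of iterations 0..i
    bounds = []
    start = 0
    target = 0.0  # cumulative work that processors 0..p should cover
    for p, s in enumerate(speeds):
        target += total_work * s / total_speed
        if p == len(speeds) - 1:
            end = len(work)  # last processor takes whatever remains
        else:
            end = start
            # Extend the block while the cumulative work stays within target.
            while end < len(work) and prefix[end] <= target + 1e-9:
                end += 1
        bounds.append((start, end))
        start = end
    return bounds


def finish_times(work, speeds, bounds):
    """Estimated compute time per processor for a given partition."""
    return [sum(work[a:b]) / s for (a, b), s in zip(bounds, speeds)]


if __name__ == "__main__":
    uniform = [1] * 12
    blocks = partition_loop(uniform, [1, 2, 3])
    print(blocks, finish_times(uniform, [1, 2, 3], blocks))
```

With uniform iterations and speeds 1, 2, and 3, the faster processors receive proportionally longer blocks and all three finish at the same estimated time. With non-uniform iterations the greedy cut points only approximate a balanced schedule, and a fuller model in the spirit of the abstract would also charge for communication between processors.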