Concurrency and Computation: Practice & Experience
This paper addresses the problem of partitioning and scheduling loops for a network of heterogeneous workstations. By isolating the effects of send and receive operations and quantifying the impact of network contention on the overall communication cost, we present a simple yet accurate cost model for predicting the communication overhead between a pair of workstations. The processing capacities of the workstations in a network are modelled from their CPU speeds and memory sizes. Based on these models, loop tiling is used extensively to partition and schedule loops across the workstations. By adjusting the tile sizes, i.e. the granularities of tasks, the impact of the heterogeneity arising from the program, the processors and the network is minimised. Experimental results on an Ethernet of seven DEC workstations demonstrate the effectiveness of our models and parallelisation strategies.
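The idea of matching task granularity to workstation capacity can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes each workstation's capacity has already been reduced to one positive number (the paper derives capacity from CPU speed and memory size, which is not modelled here), it ignores communication cost, and the function name `partition_iterations` is hypothetical.

```python
def partition_iterations(n_iters, capacities):
    """Split n_iters loop iterations into contiguous blocks whose sizes
    are proportional to each workstation's relative capacity.

    capacities: one positive number per workstation (an assumed,
    pre-computed capacity score; not the paper's model).
    Returns a list of half-open (start, end) ranges, one per workstation.
    """
    total = sum(capacities)
    # Ideal (fractional) share of iterations for each workstation.
    shares = [n_iters * c / total for c in capacities]
    # Start from the rounded-down shares.
    sizes = [int(s) for s in shares]
    # Give the leftover iterations to the largest fractional remainders.
    leftover = n_iters - sum(sizes)
    order = sorted(range(len(capacities)),
                   key=lambda i: shares[i] - sizes[i],
                   reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    # Turn the block sizes into contiguous iteration ranges.
    blocks = []
    start = 0
    for size in sizes:
        blocks.append((start, start + size))
        start += size
    return blocks


if __name__ == "__main__":
    # Three workstations; the third is assumed twice as capable as the others.
    print(partition_iterations(100, [1, 1, 2]))
```

A faster workstation receives a proportionally larger block, so all workstations would finish at roughly the same time if computation were the only cost. A fuller treatment would also charge each block for its predicted communication overhead.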