Many parallel algorithms are currently defined for shared-memory architectures. The preferred machine model is the PRAM, but this model ignores properties of existing architectures, which have distributed memory and an asynchronous execution model. A transformation of PRAM programs into distributed, asynchronous ones is known. To produce code that is efficient as well as correct, the tasks have to be clustered. We introduce a parallel algorithm that produces an optimal clustering for coarse-grained task graphs with respect to the execution time on an asynchronous distributed random access machine, the A-DRAM. This machine model assumes distributed memory, asynchronous execution of tasks, computation costs, and communication delay.
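To make the cost model concrete, the following minimal sketch evaluates the execution time of a given clustering of a task graph. It is not the clustering algorithm introduced here, and it rests on assumptions that go beyond the abstract: each cluster runs on its own processor, tasks of one cluster run sequentially in a topological order, a message between different clusters takes a fixed `delay`, and communication inside a cluster is free. The names `makespan`, `cost`, `edges`, `cluster`, and `delay` are hypothetical.

```python
from collections import defaultdict, deque


def makespan(cost, edges, cluster, delay):
    """Finish time of a clustered task DAG under a simple delay model.

    cost:    dict task -> computation cost
    edges:   list of (producer, consumer) dependencies
    cluster: dict task -> cluster id (assumption: one processor per cluster)
    delay:   assumed fixed communication delay between different clusters
    """
    preds = defaultdict(list)
    succs = defaultdict(list)
    indeg = {t: 0 for t in cost}
    for u, v in edges:
        preds[v].append(u)
        succs[u].append(v)
        indeg[v] += 1

    ready = deque(t for t in cost if indeg[t] == 0)
    finish = {}
    free_at = defaultdict(int)  # time at which each cluster's processor is idle

    while ready:
        t = ready.popleft()
        c = cluster[t]
        # A task starts once its processor is idle and all inputs have arrived.
        start = free_at[c]
        for p in preds[t]:
            arrival = finish[p] + (0 if cluster[p] == c else delay)
            start = max(start, arrival)
        finish[t] = start + cost[t]
        free_at[c] = finish[t]
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)

    if len(finish) != len(cost):
        raise ValueError("task graph contains a cycle")
    return max(finish.values(), default=0)


# Diamond-shaped graph: a feeds b and c, which both feed d.
cost = {"a": 4, "b": 4, "c": 4, "d": 4}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
one_cluster = {"a": 0, "b": 0, "c": 0, "d": 0}
two_clusters = {"a": 0, "b": 1, "c": 0, "d": 0}
print(makespan(cost, edges, one_cluster, delay=1))
print(makespan(cost, edges, two_clusters, delay=1))
```

In this example every computation cost exceeds the communication delay, which is one common reading of "coarse grained". Moving task `b` to a second cluster then pays off despite the two extra messages: the finish time drops from 16 to 14.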