We examine a very simple asynchronous model of parallel computation that assumes the time to compute a task is random, following some probability distribution. The model aims to capture the effect of unpredictable delays on processors caused, for example, by communication latency or cache misses. Using techniques from queueing theory and occupancy problems, we use this model to analyze two parallel dynamic programming algorithms. We show that the model is simple to analyze and correctly predicts which algorithm will perform better in practice. The algorithms we consider are a pipeline algorithm, where each processor i computes in order the entries of rows i, i + p, i + 2p, and so on, where p is the number of processors; and a diagonal algorithm, where the entries along each diagonal extending from the left edge to the top edge of the table are computed in turn. The techniques used here are likely to prove useful in analyzing other algorithms that rely on barriers or pipelining.
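To make the two schedules concrete, here is a minimal simulation sketch, not the authors' implementation. It assumes a standard two-dimensional dynamic programming recurrence in which entry (i, j) depends on (i-1, j), (i, j-1), and (i-1, j-1), i.i.d. exponential task times, and a round-robin assignment of diagonal entries to processors; all of these are illustrative assumptions. It estimates the expected completion time of the pipeline schedule and of the barrier-synchronized diagonal schedule, so the two effects of interest, a slow task delaying only the rows below it versus a barrier forcing every processor to wait for the slowest, can be compared numerically for a given task-time distribution.

```python
import random

def pipeline_makespan(n, p, draw):
    """Pipeline schedule: processor k computes rows k, k+p, k+2p, ... left to
    right; entry (i, j) waits for (i-1, j) and for its own processor to be free.
    (The dependence on (i-1, j-1) is dominated by (i-1, j), which finishes later.)"""
    finish = [[0.0] * n for _ in range(n)]
    proc_free = [0.0] * p                        # when each processor is next available
    for i in range(n):
        k = i % p                                # row i is handled by processor i mod p
        for j in range(n):
            ready = finish[i - 1][j] if i > 0 else 0.0   # wait for the row above
            start = max(proc_free[k], ready)
            finish[i][j] = start + draw()        # random task time (assumed distribution)
            proc_free[k] = finish[i][j]
    return max(proc_free)

def diagonal_makespan(n, p, draw):
    """Diagonal schedule: the entries of each anti-diagonal are split among the
    p processors, with a barrier after every diagonal."""
    t = 0.0
    for d in range(2 * n - 1):
        cells = min(d, n - 1) - max(0, d - n + 1) + 1   # entries on diagonal d
        loads = [0.0] * p
        for c in range(cells):                   # round-robin assignment (assumption)
            loads[c % p] += draw()
        t += max(loads)                          # barrier: wait for the slowest processor
    return t

if __name__ == "__main__":
    random.seed(0)
    n, p, trials = 200, 8, 20
    draw = lambda: random.expovariate(1.0)       # i.i.d. exponential task times (assumption)
    pipe = sum(pipeline_makespan(n, p, draw) for _ in range(trials)) / trials
    diag = sum(diagonal_makespan(n, p, draw) for _ in range(trials)) / trials
    print(f"pipeline ~ {pipe:.1f}, diagonal ~ {diag:.1f}")
```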