Proceedings of the ACM 2000 conference on Java Grande
We model a deterministic parallel program as a directed acyclic graph of tasks, where a task can execute as soon as all tasks preceding it have executed. Each task can allocate or release an arbitrary amount of memory, so heap memory allocation can be modeled. We call a parallel schedule "space efficient" if the amount of memory it requires is at most the number of processors times the amount of memory required by some depth-first execution of the program on a single processor. We describe a simple, locally depth-first scheduling algorithm and show that it is always space efficient. Because the scheduling algorithm is greedy, its running time is within a factor of two of optimal. For the special case of a program with series-parallel structure, we show how to efficiently compute the worst-case memory requirement over all possible depth-first executions of the program. Finally, we show how scheduling can be decentralized, making the approach scalable to a large number of processors when there is sufficient parallelism.
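The definitions in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's algorithm: it assumes unit-time tasks, a single root task, and a plain greedy scheduler with a last-in, first-out ready pool. The four-task graph, the memory deltas, and all function names are hypothetical. It measures the peak memory of one serial depth-first execution (S1), then runs a greedy schedule on P processors and compares its peak memory against P times S1 and its length against the standard greedy bound of T1/P plus the critical-path length.

```python
# Hypothetical task DAG: task -> successors, plus net memory change per task.
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
mem_delta = {"a": 2, "b": 3, "c": 1, "d": -6}


def indegrees(succ):
    """Number of unfinished predecessors for every task."""
    pending = {t: 0 for t in succ}
    for t in succ:
        for s in succ[t]:
            pending[s] += 1
    return pending


def span(succ, t):
    """Length (in tasks) of the longest path starting at task t."""
    return 1 + max((span(succ, s) for s in succ[t]), default=0)


def greedy_schedule(succ, mem_delta, root, p):
    """Run up to p ready tasks per step; return (steps, peak memory).

    With p == 1 the LIFO ready pool yields a serial depth-first execution.
    Within a step, allocations are counted before releases, which gives a
    conservative (upper) estimate of the peak.
    """
    pending = indegrees(succ)
    ready = [root]
    used = peak = steps = 0
    while ready:
        batch = [ready.pop() for _ in range(min(p, len(ready)))]
        steps += 1
        peak = max(peak, used + sum(max(mem_delta[t], 0) for t in batch))
        used += sum(mem_delta[t] for t in batch)
        for t in batch:
            # Reversed so that the first-listed successor is popped first.
            for s in reversed(succ[t]):
                pending[s] -= 1
                if pending[s] == 0:
                    ready.append(s)
    return steps, peak


P = 2
t1, s1 = greedy_schedule(succ, mem_delta, "a", 1)   # serial depth-first run
tp, sp = greedy_schedule(succ, mem_delta, "a", P)   # greedy run on P processors
t_inf = span(succ, "a")

print("space efficient:", sp <= P * s1)
print("within greedy time bound:", tp <= t1 / P + t_inf)
```

Any greedy schedule of unit-time tasks satisfies the time bound, and since both T1/P and the critical-path length are lower bounds on the optimal time, the factor of two follows. The space check, by contrast, is only an observation about this one example: a generic greedy scheduler does not guarantee it in general, which is what the locally depth-first algorithm is designed to provide.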