Space-Efficient Scheduling of Multithreaded Computations

Authors:
Robert D. Blumofe; Charles E. Leiserson
Venue:
SIAM Journal on Computing
Year:
1998

Abstract

This paper considers the problem of scheduling dynamic parallel computations to achieve linear speedup without using significantly more space per processor than that required for a single-processor execution. Using a new graph-theoretic model of multithreaded computation, the paper quantifies execution efficiency by three measures: $T_1$ is the time required to execute the computation on one processor, $T_\infty$ is the time required by an infinite number of processors, and $S_1$ is the space required to execute the computation on one processor. A computation executed on $P$ processors is time-efficient if the time is $O(T_1/P + T_\infty)$, that is, it achieves linear speedup when $P = O(T_1/T_\infty)$, and it is space-efficient if it uses $O(S_1 P)$ total space, that is, the space per processor is within a constant factor of that required for a one-processor execution.

The first result derived from this model shows that there exist multithreaded computations such that no execution schedule can simultaneously achieve efficient time and efficient space. But by restricting attention to "strict" computations (those in which all arguments to a procedure must be available before the procedure can be invoked), much more positive results are obtainable. Specifically, for any strict multithreaded computation, a simple online algorithm can compute a schedule that is both time-efficient and space-efficient. Unfortunately, because the algorithm uses a global queue, the overhead of computing the schedule can be substantial. This problem is overcome by a decentralized algorithm that can compute and execute a $P$-processor schedule online in expected time $O(T_1/P + T_\infty \lg P)$ and worst-case space $O(S_1 P \lg P)$, including overhead costs.
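
The time-efficiency condition in the abstract can be made concrete with a small sketch. The code below is not the paper's algorithm: it assumes unit-time tasks in a dependency DAG, uses a plain greedy scheduler with a shared ready queue, and ignores space entirely. The function names (`work_and_span`, `greedy_steps`, `fan_out_dag`) and the example graph are hypothetical. It computes $T_1$ (total work) and $T_\infty$ (longest dependency chain), then checks the standard greedy bound $T_P \le T_1/P + T_\infty$ for several values of $P$.

```python
from collections import deque


def _indegrees(dag):
    # dag maps every task to the list of tasks that depend on it.
    indeg = {u: 0 for u in dag}
    for u in dag:
        for v in dag[u]:
            indeg[v] += 1
    return indeg


def work_and_span(dag):
    """Return (T_1, T_inf) for a DAG of unit-time tasks."""
    indeg = _indegrees(dag)
    depth = {u: 1 for u in dag}
    ready = deque(u for u in dag if indeg[u] == 0)
    while ready:
        u = ready.popleft()
        for v in dag[u]:
            # Longest chain ending at v passes through its deepest predecessor.
            depth[v] = max(depth[v], depth[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return len(dag), max(depth.values())


def greedy_steps(dag, p):
    """Number of steps a greedy p-processor schedule takes (assumed model)."""
    indeg = _indegrees(dag)
    ready = deque(u for u in dag if indeg[u] == 0)
    steps = 0
    while ready:
        # Greedy rule: run as many ready tasks as there are processors.
        batch = [ready.popleft() for _ in range(min(p, len(ready)))]
        for u in batch:
            for v in dag[u]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    ready.append(v)
        steps += 1
    return steps


def fan_out_dag(n):
    """Hypothetical example: one root, n independent tasks, one join task."""
    dag = {"root": [f"t{i}" for i in range(n)], "join": []}
    for i in range(n):
        dag[f"t{i}"] = ["join"]
    return dag


if __name__ == "__main__":
    dag = fan_out_dag(8)
    t1, tinf = work_and_span(dag)
    for p in (1, 2, 4, 8):
        tp = greedy_steps(dag, p)
        print(f"P={p}: T_P={tp}, bound T_1/P + T_inf = {t1 / p + tinf:.2f}")
```

In this example $T_1 = 10$ and $T_\infty = 3$, so $T_1/T_\infty \approx 3.3$; adding processors beyond roughly that number yields little further speedup, which is the regime the condition $P = O(T_1/T_\infty)$ describes.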