Scheduling DAG's for Asynchronous Multiprocessor Execution

Authors:
B. A. Malloy; E. L. Lloyd; M. L. Soffa
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1994

Abstract

A new approach is given for scheduling a sequential instruction stream for execution "in parallel" on asynchronous multiprocessors. The key idea in our approach is to exploit the fine-grained parallelism present in the instruction stream. In this context, schedules are constructed by a careful balancing of execution and communication costs at the level of individual instructions and their data dependencies. Three methods are used to evaluate our approach. First, several existing methods are extended to the fine-grained situation. Our approach is then compared to these methods using both static schedule length analyses and simulated executions of the scheduled code. In each instance, our method is found to provide significantly shorter schedules. Second, by varying parameters such as the speed of the instruction set and the speed/parallelism in the interconnection structure, simulation techniques are used to examine the effects of various architectural considerations on the executions of the schedules. These results show that our approach provides significant speedups in a wide range of situations. Third, schedules produced by our approach are executed on a two-processor Data General shared-memory multiprocessor system. These experiments show that there is a strong correlation between our simulation results and these actual executions, and thereby serve to validate the simulation studies. Together, our results establish that fine-grained parallelism can be exploited in a substantial manner when scheduling a sequential instruction stream for execution "in parallel" on asynchronous multiprocessors.
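
The balance between execution and communication costs described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not spell out; it is a generic greedy list scheduler, written under the assumption of a single uniform communication delay between processors. The function name `list_schedule` and its parameters (`exec_cost`, `deps`, `comm_cost`, `num_procs`) are hypothetical names chosen for this example.

```python
from collections import defaultdict


def list_schedule(exec_cost, deps, comm_cost, num_procs):
    """Greedy list scheduling of an instruction DAG (illustrative only).

    exec_cost: dict mapping each instruction to its execution time
    deps:      list of (producer, consumer) data-dependence edges
    comm_cost: assumed uniform delay when producer and consumer
               run on different processors
    Returns (placement, makespan), where placement maps each
    instruction to a (processor, start, finish) tuple.
    """
    preds = defaultdict(list)
    succs = defaultdict(list)
    indegree = {n: 0 for n in exec_cost}
    for u, v in deps:
        preds[v].append(u)
        succs[u].append(v)
        indegree[v] += 1

    # Instructions with no unscheduled predecessors, in FIFO order.
    ready = sorted(n for n in exec_cost if indegree[n] == 0)
    proc_free = [0] * num_procs  # time at which each processor is idle
    placement = {}

    while ready:
        node = ready.pop(0)
        best = None
        for p in range(num_procs):
            # An operand produced on another processor arrives
            # comm_cost time units after its producer finishes.
            data_ready = max(
                (placement[u][2] + (0 if placement[u][0] == p else comm_cost)
                 for u in preds[node]),
                default=0,
            )
            start = max(proc_free[p], data_ready)
            finish = start + exec_cost[node]
            # Keep the processor giving the earliest finish time.
            if best is None or finish < best[2]:
                best = (p, start, finish)
        placement[node] = best
        proc_free[best[0]] = best[2]
        for v in succs[node]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    makespan = max(finish for _, _, finish in placement.values())
    return placement, makespan


# Four unit-cost instructions: c needs a and b, d needs c.
costs = {"a": 1, "b": 1, "c": 1, "d": 1}
edges = [("a", "c"), ("b", "c"), ("c", "d")]
for comm in (0, 5):
    _, length = list_schedule(costs, edges, comm, num_procs=2)
    print(f"comm_cost={comm}: schedule length {length}")
```

On this toy DAG the greedy scheduler gives a schedule of length 3 on two processors when communication is free, but length 8 when the delay is 5, which is worse than the length of 4 obtained on a single processor. A scheduler that ignores communication when spreading fine-grained work can therefore lose to sequential execution, which is the kind of tradeoff a cost-balancing approach has to manage.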