A new approach is given for scheduling a sequential instruction stream for execution "in parallel" on asynchronous multiprocessors. The key idea in our approach is to exploit the fine grained parallelism present in the instruction stream. In this context, schedules are constructed by a careful balancing of execution and communication costs at the level of individual instructions and their data dependencies. Three methods are used to evaluate our approach. First, several existing methods are extended to the fine grained situation. Our approach is then compared to these methods using both static schedule length analyses and simulated executions of the scheduled code. In each instance, our method is found to provide significantly shorter schedules. Second, by varying parameters such as the speed of the instruction set and the speed/parallelism in the interconnection structure, simulation techniques are used to examine the effects of various architectural considerations on the executions of the schedules. These results show that our approach provides significant speedups in a wide range of situations. Third, schedules produced by our approach are executed on a two-processor Data General shared memory multiprocessor system. These experiments show that there is a strong correlation between our simulation results and these actual executions, and thereby serve to validate the simulation studies. Together, our results establish that fine grained parallelism can be exploited in a substantial manner when scheduling a sequential instruction stream for execution "in parallel" on asynchronous multiprocessors.
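To make the trade-off between execution and communication costs concrete, the following is a minimal sketch of a generic greedy list scheduler. It is not the method of the paper, whose algorithm the abstract does not specify. The names `schedule`, `makespan`, `comm_cost`, and `n_procs`, the uniform communication delay, and the example instruction costs are all assumptions made for illustration.

```python
def schedule(instrs, deps, n_procs=2, comm_cost=2):
    """Greedy list-scheduling sketch (illustrative only).

    instrs: dict mapping instruction name -> execution cost, listed in
            sequential program order (dicts keep insertion order in 3.7+).
    deps:   dict mapping instruction name -> list of names it depends on.
    comm_cost: assumed uniform delay when a value crosses processors.
    Returns a dict mapping name -> (processor, start, finish).
    """
    proc_free = [0] * n_procs  # time at which each processor becomes idle
    placed = {}
    for name, cost in instrs.items():
        best = None
        for p in range(n_procs):
            # An operand is available on p once its producer has finished,
            # plus the communication delay if it was computed elsewhere.
            ready = 0
            for d in deps.get(name, []):
                d_proc, _, d_finish = placed[d]
                arrival = d_finish + (comm_cost if d_proc != p else 0)
                ready = max(ready, arrival)
            start = max(ready, proc_free[p])
            finish = start + cost
            # Keep the processor giving the earliest finish time.
            if best is None or finish < best[2]:
                best = (p, start, finish)
        placed[name] = best
        proc_free[best[0]] = best[2]
    return placed


def makespan(placed):
    """Length of the schedule: the latest finish time of any instruction."""
    return max(finish for _, _, finish in placed.values())


# Hypothetical four-instruction stream: c needs a, d needs b and c.
instrs = {"a": 1, "b": 1, "c": 1, "d": 1}
deps = {"c": ["a"], "d": ["b", "c"]}
print(makespan(schedule(instrs, deps, comm_cost=2)))
print(makespan(schedule(instrs, deps, comm_cost=0)))
```

In this toy example the schedule is longer when communication has a cost than when it is free, because the value produced on the second processor must be transferred before the last instruction can start. A greedy rule like this one can also place an instruction remotely and pay for the transfer later, which is one reason a more careful balancing of the two costs matters at this granularity.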