This paper is concerned with the automatic exploitation of the parallelism detected in a sequential program. The target machine is a shared-memory multiprocessor. The main goal is to minimize the completion time of the program. To achieve this, one must first distribute the code over the processors, then schedule the parts of the code so as to minimize the execution time while preserving the semantics. This problem is NP-complete. Loop scheduling and processor allocation are the main problems, but we are also able to handle so-called control parallelism. Allocation and scheduling are performed at compile time. For a given processor allocation, we use a list scheduling algorithm to compute the elapsed time, which is then optimized by the Tabu heuristic. The estimation of each component's execution time is based on knowledge of the average execution times of the operators and built-in functions and on an estimation of the iteration-space size. Experiments on the Encore Multimax machine show that, on a representative set of scientific programs, the efficiency we obtain exceeds 80% in almost all cases, as soon as the problem size is large enough.
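To illustrate the inner evaluation step described above, the following is a minimal sketch of greedy list scheduling for a fixed processor allocation: tasks with known durations and precedence constraints are started, in priority order, on the earliest-available processor, and the resulting makespan is the quantity a Tabu-style search would then try to minimize by perturbing the allocation. The task/duration/predecessor representation and the task-index priority rule are illustrative assumptions, not the paper's actual data structures.

```python
def list_schedule(durations, preds, num_procs):
    """Greedy list scheduling (illustrative sketch).

    durations: list of task durations.
    preds[t]:  list of tasks that must finish before task t starts.
    Returns the makespan (completion time of the last task).
    Priority rule assumed here: lowest ready time, then lowest task index.
    """
    n = len(durations)
    indegree = [len(preds[t]) for t in range(n)]
    succs = [[] for _ in range(n)]
    for t in range(n):
        for p in preds[t]:
            succs[p].append(t)

    proc_free = [0.0] * num_procs      # next free time of each processor
    ready_at = [0.0] * n               # earliest start allowed by precedence
    finish = [0.0] * n
    ready = [t for t in range(n) if indegree[t] == 0]

    scheduled = 0
    while scheduled < n:
        # Pick the highest-priority ready task.
        t = min(ready, key=lambda x: (ready_at[x], x))
        ready.remove(t)
        # Place it on the earliest-available processor.
        p = min(range(num_procs), key=lambda i: proc_free[i])
        start = max(proc_free[p], ready_at[t])
        finish[t] = start + durations[t]
        proc_free[p] = finish[t]
        scheduled += 1
        # Release successors whose predecessors are all done.
        for s in succs[t]:
            indegree[s] -= 1
            ready_at[s] = max(ready_at[s], finish[t])
            if indegree[s] == 0:
                ready.append(s)
    return max(finish)
```

For example, three independent unit-cost-2 tasks on two processors give a makespan of 4, while a fork of tasks of lengths 2 and 3 after a length-1 task gives 4 on two processors and 6 on one. A search heuristic such as Tabu would call this routine repeatedly, once per candidate allocation, keeping a short-term memory of recently visited allocations to avoid cycling.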