IEEE Transactions on Parallel and Distributed Systems
When the number of processors P is less than the number of tasks N in a parallel loop, the loop has to be executed in ⌈N/P⌉ rounds, and when P does not divide N the last round executes only N mod P tasks. In many cases all but a few processors are idle in this last round, which causes a significant drop in performance. The drop becomes more detrimental as the number of processors increases. Loop spreading is a technique for restructuring parallel loops so as to balance the parallel tasks across multiple processors. A spread loop runs at least as fast as the non-spread loop even when N mod P = 0, and shows no performance drop as N changes. We show how the method keeps the performance of matrix multiplication and of a simplex algorithm from decreasing as the input size changes.
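The round arithmetic behind this performance drop can be illustrated with a small sketch. It is not the loop spreading transformation itself; it only computes, for a non-spread loop, the number of rounds, the number of tasks in the last round, and the fraction of processor slots that do useful work. The function name `round_schedule` and the assumption that every task takes one unit of time are illustrative choices, not taken from the paper.

```python
def round_schedule(n_tasks, n_procs):
    """Describe how a non-spread parallel loop of n_tasks runs on n_procs.

    Assumes (hypothetically) that all tasks take the same unit time.
    Returns (rounds, tasks_in_last_round, utilization).
    """
    if n_tasks <= 0 or n_procs <= 0:
        raise ValueError("n_tasks and n_procs must be positive")
    # Integer ceiling of n_tasks / n_procs.
    rounds = -(-n_tasks // n_procs)
    remainder = n_tasks % n_procs
    # A zero remainder means the last round is full.
    last_round = remainder if remainder else n_procs
    # Useful task slots divided by all slots occupied over the rounds.
    utilization = n_tasks / (rounds * n_procs)
    return rounds, last_round, utilization


if __name__ == "__main__":
    for n in (64, 65, 128):
        rounds, last, util = round_schedule(n, 64)
        print(f"N={n:3d} P=64 rounds={rounds} last_round={last} utilization={util:.3f}")
```

Under these assumptions, going from N = 64 to N = 65 on P = 64 processors doubles the number of rounds while adding a single task, so roughly half of the processor slots sit idle; this is the kind of drop that loop spreading is meant to avoid.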