(R) Scheduling of Wavefront Parallelism on Scalable Shared-memory Multiprocessors

Venue:
ICPP '96 Proceedings of the 1996 International Conference on Parallel Processing - Volume 3
Year:
1996

Citing 0
Cited 4

Fusion of Loops for Parallelism and Locality
IEEE Transactions on Parallel and Distributed Systems

Locality Enhancement for Large-Scale Shared-Memory Multiprocessors
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers

Patterns of optimized loops
Proceedings of the 2010 Workshop on Parallel Programming Patterns

Locality optimizations for jacobi iteration on distributed parallel systems
ISPA'04 Proceedings of the Second international conference on Parallel and Distributed Processing and Applications

Quantified Score

Hi-index	0.00

Abstract

Tiling exploits temporal reuse carried by an outer loop of a loop nest to enhance cache locality. Loop skewing is typically required to make tiling legal, which restricts parallelism to wavefronts in the tiled iteration space. For a small number of processors, wavefront parallelism can be exploited efficiently using dynamic self-scheduling with a large tile size. Such a strategy enhances intratile locality, but does not necessarily enhance intertile locality. We show that dynamic self-scheduling performs poorly on scalable shared-memory multiprocessors, where smaller tiles are necessary to provide sufficient parallelism. Smaller tiles place greater importance on intertile locality. We propose static scheduling strategies which enhance intertile locality for small tiles. Results of experiments on a Convex SPP1000 multiprocessor demonstrate that our strategies outperform dynamic self-scheduling by a factor of up to 2.3 on 30 processors.
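
As an illustration of the terms used in the abstract, the following minimal sketch groups the tiles of a two-dimensional tiled iteration space into wavefronts and assigns them to processors with a fixed mapping. It rests on assumptions that are not taken from the paper: each tile (i, j) is assumed to depend only on tiles (i-1, j) and (i, j-1), so tiles on the same anti-diagonal are independent, and the row-based `static_owner` mapping is a hypothetical stand-in, not the scheduling strategies the authors propose. All function names are invented for this example.

```python
def wavefronts(n_rows, n_cols):
    """Group tile coordinates (i, j) by anti-diagonal index w = i + j.

    Assumes tile (i, j) depends only on (i-1, j) and (i, j-1), so all
    tiles in one wavefront can run in parallel.
    """
    fronts = [[] for _ in range(n_rows + n_cols - 1)]
    for i in range(n_rows):
        for j in range(n_cols):
            fronts[i + j].append((i, j))
    return fronts


def static_owner(tile, n_procs):
    """Hypothetical static mapping: every tile in row i runs on processor i % n_procs.

    Keeping a row on one processor is one simple way to let consecutive
    tiles reuse data already in that processor's cache (intertile locality).
    """
    i, _ = tile
    return i % n_procs


def schedule(n_rows, n_cols, n_procs):
    """Return one dict per wavefront, mapping processor id to its tiles."""
    plan = []
    for front in wavefronts(n_rows, n_cols):
        step = {}
        for tile in front:
            step.setdefault(static_owner(tile, n_procs), []).append(tile)
        plan.append(step)
    return plan
```

Calling `schedule(2, 3, 2)` yields four wavefront steps. In a dynamic self-scheduling scheme, by contrast, a free processor would take whichever ready tile comes next, so the tile-to-processor mapping would not be fixed in advance.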