Loop Transformations for Hierarchical Parallelism and Locality

Authors:
Vivek Sarkar
Venue:
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Year:
1998

Citing 15

Guided self-scheduling: A practical scheduling scheme for parallel supercomputers
IEEE Transactions on Computers

Supernode partitioning
POPL '88 Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages

A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation

A general framework for iteration-reordering loop transformations
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation

Improving data locality with loop transformations
ACM Transactions on Programming Languages and Systems (TOPLAS)

Data-centric multi-level blocking
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation

Automatic selection of high-order transformations in the IBM XL FORTRAN compilers
IBM Journal of Research and Development - Special issue: performance analysis and its impact on design

The parallel execution of DO loops
Communications of the ACM

Optimizing Supercompilers for Supercomputers
Optimizing Supercompilers for Supercomputers

Dependence Analysis for Supercomputing
Dependence Analysis for Supercomputing

A Loop Transformation Theory and an Algorithm to Maximize Parallelism
IEEE Transactions on Parallel and Distributed Systems

Locality Analysis for Distributed Shared-Memory Multiprocessors
LCPC '96 Proceedings of the 9th International Workshop on Languages and Compilers for Parallel Computing

Quantifying the Multi-level Nature of Tiling Interactions
LCPC '97 Proceedings of the 10th International Workshop on Languages and Compilers for Parallel Computing

On Estimating and Enhancing Cache Effectiveness
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing

Automatic parallelization for symmetric shared-memory multiprocessors
CASCON '96 Proceedings of the 1996 conference of the Centre for Advanced Studies on Collaborative research

Cited 1

Optimized Execution of Fortran 90 Array Language on Symmetric Shared-Memory Multiprocessors
LCPC '98 Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing

Abstract

The increasing depth of memory and parallelism hierarchies in future scalable computer systems poses many challenges to parallelizing compilers. In this paper, we address the problem of selecting and implementing iteration-reordering loop transformations for hierarchical parallelism and locality. We present a two-pass algorithm for selecting sequences of Block, Unimodular, Parallel, and Coalesce transformations for optimizing locality and parallelism for a specified parallelism hierarchy model. These general transformation sequences are implemented using a framework for iteration-reordering loop transformations that we developed in past work [15].
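
To make the transformation names in the abstract concrete, the following minimal sketch shows what Block, Unimodular, and Coalesce do to the iteration order of a simple two-deep loop nest. It is an illustration only, not the paper's selection algorithm or its framework: the function names, the rectangular iteration space, and the choice of loop interchange as the unimodular example are all assumptions, and dependence legality is ignored. A Parallel transformation would not change the order at all; it would only mark a loop as safe to run concurrently, so it is omitted here.

```python
def original(n, m):
    """Original nest: for i in range(n): for j in range(m)."""
    return [(i, j) for i in range(n) for j in range(m)]


def block(n, m, bi, bj):
    """Block (tile) both loops with hypothetical tile sizes bi x bj.

    The two outer loops walk over tiles; the two inner loops walk
    over the iterations inside one tile.
    """
    order = []
    for ii in range(0, n, bi):
        for jj in range(0, m, bj):
            for i in range(ii, min(ii + bi, n)):
                for j in range(jj, min(jj + bj, m)):
                    order.append((i, j))
    return order


def unimodular_interchange(n, m):
    """Unimodular example: loop interchange, i.e. matrix [[0, 1], [1, 0]]."""
    return [(i, j) for j in range(m) for i in range(n)]


def coalesce(n, m):
    """Coalesce the two loops into one loop of n * m iterations.

    The original indices are recovered from the single index k
    as i = k // m and j = k % m.
    """
    return [divmod(k, m) for k in range(n * m)]
```

Each function returns the sequence of (i, j) iterations in execution order. All of them visit exactly the same set of iterations as `original`; only the order (or the loop structure that produces it) differs, which is what makes them iteration-reordering transformations.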