Forward communication only placements and their use for parallel program construction

Authors:
Martin Griebl; Paul Feautrier; Armin Größlinger
Affiliations:
FMI, University of Passau, Germany; Unité de Recherche de Rocquencourt, INRIA, France; FMI, University of Passau, Germany
Venue:
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
Year:
2002

Abstract

The context of this paper is automatic parallelization by the space-time mapping method. One key issue in that approach is adjusting the granularity of the derived parallelism. For that purpose, we use tiling in the space and time dimensions. While space tiling is always legal, time tiling is subject to constraints unless the placement is such that communications always go in the same direction (forward communications only, FCO). We derive an algorithm that automatically constructs an FCO placement if one exists. We show that the method is applicable to many familiar kernels and that it gives satisfactory speedups.
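
To make the forward-communications-only (FCO) condition concrete, the following minimal sketch checks whether a given placement sends every communication in a non-negative direction. It is not the construction algorithm from the paper. It assumes a simplified setting that the abstract does not state: dependences are uniform (constant distance vectors) and the placement is linear, given as rows of an integer matrix. The function name `is_fco` and the stencil example are hypothetical.

```python
from typing import Sequence

Vector = Sequence[int]


def is_fco(placement: Sequence[Vector], dependences: Sequence[Vector]) -> bool:
    """Check the FCO condition under the simplifying assumptions above.

    Each row of `placement` maps an iteration vector to one processor
    coordinate. The placement is treated as FCO if no dependence vector
    is mapped to a negative offset in any processor dimension.
    """
    for dep in dependences:
        for row in placement:
            offset = sum(p * x for p, x in zip(row, dep))
            if offset < 0:
                return False  # this dependence communicates backwards
    return True


# Assumed example: a 1-D stencil sweep over (t, i), where iteration (t, i)
# reads values produced at (t-1, i-1), (t-1, i), and (t-1, i+1).
stencil_deps = [(1, 1), (1, 0), (1, -1)]

print(is_fco([(0, 1)], stencil_deps))  # place by i alone
print(is_fco([(1, 1)], stencil_deps))  # place by t + i (skewed)
```

In this assumed example, placing by `i` alone fails the check because the dependence `(1, -1)` maps to offset `-1`, whereas the skewed placement `t + i` maps all three dependences to offsets `0`, `1`, and `2`.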