Improved loop tiling based on the removal of spurious false dependences

Authors:
Riyadh Baghdadi;Albert Cohen;Sven Verdoolaege;Konrad Trifunović
Affiliations:
École Normale Supérieure and INRIA;École Normale Supérieure and INRIA;École Normale Supérieure and INRIA;École Normale Supérieure and INRIA
Venue:
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Year:
2013

Abstract

To preserve the validity of loop nest transformations and parallelization, data dependences need to be analyzed. Memory dependences come in two varieties: true dependences and false dependences. While true dependences must be satisfied in order to preserve the correct order of computations, false dependences are induced by the reuse of a single memory location to store multiple values. False dependences reduce the degrees of freedom for loop transformations. In particular, loop tiling is severely limited in the presence of these dependences. While array expansion removes all false dependences, its memory overhead and its detrimental impact on register-level reuse can be catastrophic. We propose and evaluate a compilation technique to safely ignore a large number of false dependences in order to enable loop nest tiling in the polyhedral model. It is based on the precise characterization of interferences between live range intervals, and it does not incur any scalar or array expansion. Our algorithms have been implemented in the Pluto polyhedral compiler, and evaluated on the PolyBench suite.
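
The distinction drawn in the abstract can be illustrated with a small sketch. It is not the paper's algorithm or anything taken from Pluto; the loop body, the function names, and the tile size are all assumptions chosen for illustration. A scalar `t` is reused by every iteration, which creates false (anti and output) dependences between iterations. Reordering whole iterations keeps each live range of `t` intact, so the false dependences can be ignored. Separating the write from the read makes the live ranges interfere, and then either the order must be kept or `t` must be expanded into an array.

```python
def run_original(a):
    """Reference loop: one scalar t is reused by every iteration."""
    b = [0] * len(a)
    for i in range(len(a)):
        t = a[i] * 2   # S1: writes t
        b[i] = t + 1   # S2: reads t (true dependence S1 -> S2)
    return b


def run_reordered(a, order):
    """Iterations run in another order (for example tile by tile).

    Each live range of t (S1 followed directly by S2) stays contiguous,
    so live ranges never overlap and the false dependences on t between
    iterations can be ignored without any expansion.
    """
    b = [0] * len(a)
    for i in order:
        t = a[i] * 2
        b[i] = t + 1
    return b


def run_distributed_no_expansion(a):
    """All S1 instances first, then all S2 instances, still one scalar t.

    The live ranges of t now overlap, so this order is NOT valid:
    every S2 reads the value written by the last S1.
    """
    b = [0] * len(a)
    t = 0
    for i in range(len(a)):
        t = a[i] * 2
    for i in range(len(a)):
        b[i] = t + 1
    return b


def run_distributed_expanded(a):
    """Same order as above, made valid by expanding t into an array.

    This removes the false dependences but costs O(n) extra memory.
    """
    t = [0] * len(a)
    b = [0] * len(a)
    for i in range(len(a)):
        t[i] = a[i] * 2
    for i in range(len(a)):
        b[i] = t[i] + 1
    return b


data = [1, 2, 3, 4]
tiled_order = [2, 3, 0, 1]  # hypothetical: tiles of size 2, visited in reverse
print(run_original(data))
print(run_reordered(data, tiled_order))
print(run_distributed_no_expansion(data))
print(run_distributed_expanded(data))
```

In this toy case the reordered loop matches the original without any extra storage, while the distributed loop is only correct once `t` is expanded. Which false dependences are safe to ignore in general is exactly what the live range interference analysis described in the abstract is meant to decide.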