To preserve the validity of loop nest transformations and parallelization, data dependences need to be analyzed. Memory dependences come in two varieties: true dependences and false dependences. True dependences must be satisfied in order to preserve the correct order of computations, whereas false dependences are induced by the reuse of a single memory location to store multiple values. False dependences reduce the degrees of freedom for loop transformations. In particular, loop tiling is severely limited in the presence of these dependences. While array expansion removes all false dependences, its memory overhead and its detrimental impact on register-level reuse can be catastrophic. We propose and evaluate a compilation technique to safely ignore a large number of false dependences in order to enable loop nest tiling in the polyhedral model. It is based on the precise characterization of interferences between live range intervals, and it does not incur any scalar or array expansion. Our algorithms have been implemented in the Pluto polyhedral compiler and evaluated on the PolyBench suite.
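The distinction between true and false dependences, and the cost of array expansion, can be illustrated with a small sketch. This is not the algorithm of the paper and does not use Pluto; the function names (`original`, `expanded`, `reordered_without_expansion`) and the toy loop are hypothetical, chosen only to show a scalar whose reuse creates false dependences, the expanded version that removes them at the price of extra storage, and a reordering that happens to be safe without expansion because the live ranges of the scalar never interleave.

```python
def original(a, b):
    """Reference loop: the scalar t is rewritten in every iteration.

    The write of t in iteration i+1 has a false (anti and output)
    dependence on the read and write of t in iteration i, although
    no value actually flows between iterations.
    """
    out = [0] * len(a)
    for i in range(len(a)):
        t = a[i] + b[i]   # write t
        out[i] = t * t    # read t: true dependence on the write above
    return out


def expanded(a, b, order):
    """Array expansion: t becomes t[i], one cell per iteration.

    All false dependences disappear, so the two statements can even be
    split into separate loops, but storage grows from O(1) to O(n).
    """
    n = len(a)
    t = [0] * n
    out = [0] * n
    for i in order:
        t[i] = a[i] + b[i]
    for i in order:
        out[i] = t[i] * t[i]
    return out


def reordered_without_expansion(a, b, order):
    """Keep the scalar t and permute the iterations.

    This is safe in this toy case because each live range of t (from
    its write to its last read) stays inside one iteration, so live
    ranges do not interfere in the new order.
    """
    out = [0] * len(a)
    for i in order:
        t = a[i] + b[i]
        out[i] = t * t
    return out


if __name__ == "__main__":
    a, b = [1, 2, 3, 4], [10, 20, 30, 40]
    order = [3, 1, 0, 2]  # an arbitrary permutation of the iterations
    print(original(a, b))
    print(expanded(a, b, order))
    print(reordered_without_expansion(a, b, order))
```

All three functions return the same list for any permutation `order` of the iteration indices. Note that the second function could not keep a scalar `t` after splitting the loop, since the live ranges would then overlap; that contrast is the intuition behind reasoning about live-range interference rather than expanding storage.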