Code Generation in the Polyhedral Model Is Easier Than You Think

Authors: Cedric Bastoul
Affiliations: Université de Versailles Saint Quentin, France
Venue: Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Year: 2004

Abstract

Many advances in automatic parallelization and optimization have been achieved through the polyhedral model. It has been extensively shown that this computational model provides convenient abstractions for reasoning about and applying program transformations. Nevertheless, the complexity of code generation has long been a deterrent to using polyhedral representations in optimizing compilers. First, code generators have a hard time coping with generated code size and control overhead, which may spoil the theoretical benefits achieved by the transformations. Second, this step is usually time-consuming, hampering the integration of the polyhedral framework into production compilers or feedback-directed, iterative optimization schemes. Moreover, current code generation algorithms cover only a restricted set of possible transformation functions. This paper discusses a general transformation framework able to deal with non-unimodular, non-invertible, non-integral or even non-uniform functions. It presents several improvements to a state-of-the-art code generation algorithm. Two directions are explored: generated code size and code generator efficiency. Experimental evidence proves the ability of the improved method to handle real-life problems.
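To make the code generation problem concrete, here is a minimal, hand-written sketch of what scanning a transformed iteration domain looks like. It is not the algorithm from the paper. The domain (0 <= i <= n, 0 <= j <= m), the schedule (t1, t2) = (2i + j, j) and the function name `scan_transformed` are assumptions chosen for illustration. The schedule matrix has determinant 2, so it is non-unimodular: only points where t1 - t2 is even correspond to real iterations, and the generated loop must skip the holes with a stride.

```python
def scan_transformed(n, m):
    """Visit every instance (i, j) of a hypothetical statement S,
    with 0 <= i <= n and 0 <= j <= m, in the order given by the
    illustrative schedule (t1, t2) = (2*i + j, j)."""
    visited = []
    # t1 = 2*i + j ranges over [0, 2*n + m].
    for t1 in range(0, 2 * n + m + 1):
        # Bounds on t2 = j follow from 0 <= (t1 - t2) / 2 <= n and 0 <= t2 <= m.
        lo = max(0, t1 - 2 * n)
        hi = min(m, t1)
        # The schedule is non-unimodular: i = (t1 - t2) / 2 must be an integer,
        # so t2 needs the same parity as t1. Start at the first such value
        # and advance with stride 2 instead of testing a guard at every point.
        start = lo + ((t1 - lo) % 2)
        for t2 in range(start, hi + 1, 2):
            i = (t1 - t2) // 2  # recover the original iterators
            j = t2
            visited.append((i, j))  # stands in for executing S(i, j)
    return visited


if __name__ == "__main__":
    for i, j in scan_transformed(3, 2):
        print(f"S({i}, {j})")
```

Even in this tiny case the generated loop nest needs min/max bounds and a stride computation. That extra control is the kind of overhead and code-size concern the abstract refers to.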