Many advances in automatic parallelization and optimization have been achieved through the polyhedral model. This computational model has been shown extensively to provide convenient abstractions for reasoning about and applying program transformations. Nevertheless, the complexity of code generation has long deterred the use of polyhedral representations in optimizing compilers. First, code generators struggle to keep generated code size and control overhead in check, and both may spoil the theoretical benefits achieved by the transformations. Second, this step is usually time-consuming, hampering the integration of the polyhedral framework into production compilers or feedback-directed, iterative optimization schemes. Moreover, current code generation algorithms cover only a restricted set of possible transformation functions. This paper discusses a general transformation framework able to deal with non-unimodular, non-invertible, non-integral, or even non-uniform functions. It presents several improvements to a state-of-the-art code generation algorithm, exploring two directions: generated code size and code generator efficiency. Experimental evidence demonstrates the ability of the improved method to handle real-life problems.