Proceedings of the 14th international conference on Supercomputing
Data locality enhancement by memory reduction
ICS '01 Proceedings of the 15th international conference on Supercomputing
Space-time trade-off optimization for a class of electronic structure calculations
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
International Journal of Parallel Programming
Complexity of Multi-dimensional Loop Alignment
STACS '02 Proceedings of the 19th Annual Symposium on Theoretical Aspects of Computer Science
Improving Data Locality by Array Contraction
IEEE Transactions on Computers
General loop fusion technique for nested loops considering timing and code size
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Exploiting Vector Parallelism in Software Pipelined Loops
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Improving locality for ODE solvers by program transformations
Scientific Programming
MPSoC memory optimization using program transformation
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Multiprocessor, Multithreading and Memory Optimization for On-Chip Multimedia Applications
Journal of Signal Processing Systems
Algorithms for memory hierarchies: advanced lectures
Algorithms for memory hierarchies: advanced lectures
Locality enhancement by array contraction
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
Model-guided empirical tuning of loop fusion
International Journal of High Performance Systems Architecture
DESOLA: An active linear algebra library using delayed evaluation and runtime code generation
Science of Computer Programming
Loop Distribution and Fusion with Timing and Code Size Optimization
Journal of Signal Processing Systems
A cache-conscious profitability model for empirical tuning of loop fusion
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Iterative collective loop fusion
CC'06 Proceedings of the 15th international conference on Compiler Construction
Integrating Memory Optimization with Mapping Algorithms for Multi-Processors System-on-Chip
ACM Transactions on Embedded Computing Systems (TECS)
Revisiting loop fusion in the polyhedral framework
Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming
Hi-index | 0.00 |
Loop fusion is a program transformation that combines several loops into one. It is used in parallelizing compilers mainly for increasing the granularity of loops and for improving data reuse. The goal of this paper is to study, from a theoretical point of view, several variants of the loop fusion problem -- identifying polynomially solvable cases and NP-complete cases -- and to make the link between these problems and some scheduling problems that arise from completely different areas. We study, among others, the fusion of loops of different types, and the fusion of loops when combined with loop shifting.