Tiling and Memory Reuse for Sequences of Nested Loops

Authors:
Youcef Bouchebaba; Fabien Coelho
Venue:
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Year:
2002

Abstract

Our aim is to minimize the electrical energy consumed when executing signal processing applications that consist of a sequence of loop nests. Most of this energy is spent transferring data between the levels of the memory hierarchy. To reduce these transfers, we transform such programs by simultaneously applying loop permutation, tiling, loop fusion with shifting, and memory reuse. Each input nest uses a stencil of data produced by the previous nest, and all references to the same array are equal up to a shift. All transformations described in this paper have been implemented in PIPS, our optimizing compiler, and the resulting reductions in cache misses have been measured.
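
To make the idea of loop fusion with shifting and memory reuse concrete, the following is a minimal one-dimensional sketch. It is not taken from the paper or from PIPS: the function names `unfused` and `fused_shifted`, the 3-point stencil, and the rolling buffer of size 3 are all illustrative assumptions, and tiling and loop permutation are left out. The second nest is shifted by one iteration so that every value it reads has already been produced, which lets a small buffer stand in for the whole intermediate array.

```python
def unfused(a):
    """Reference version: two separate loop nests and a full intermediate array b."""
    n = len(a)
    b = [0.0] * n
    # Nest 1: b[i] is a 3-point stencil over a (hypothetical example stencil).
    for i in range(1, n - 1):
        b[i] = a[i - 1] + a[i] + a[i + 1]
    c = [0.0] * n
    # Nest 2: c[i] is a 3-point stencil over the b produced by nest 1.
    for i in range(2, n - 2):
        c[i] = b[i - 1] + b[i] + b[i + 1]
    return c


def fused_shifted(a):
    """Fused version: nest 2 is shifted by one iteration, b becomes a 3-slot buffer."""
    n = len(a)
    c = [0.0] * n
    buf = [0.0] * 3  # memory reuse: b[k] is stored in buf[k % 3]
    for i in range(1, n - 1):
        # Body of nest 1 at iteration i: produce b[i].
        buf[i % 3] = a[i - 1] + a[i] + a[i + 1]
        # Body of nest 2 at iteration j = i - 1 (the shift): it reads
        # b[j-1], b[j], b[j+1] = b[i-2], b[i-1], b[i], all still in buf.
        j = i - 1
        if 2 <= j <= n - 3:
            c[j] = buf[(j - 1) % 3] + buf[j % 3] + buf[(j + 1) % 3]
    return c
```

In this sketch both versions compute the same `c`, but the fused version touches each value of `b` while it is still in a three-element window instead of writing and re-reading an array of length `n`. That shorter reuse distance is the kind of effect the transformations aim for.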