Optimizing local memory allocation and assignment through a decoupled approach

Authors:
Boubacar Diouf; Ozcan Ozturk; Albert Cohen
Affiliations:
INRIA Saclay and Paris-Sud 11 University; Bilkent University; INRIA Saclay and Paris-Sud 11 University
Venue:
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Year:
2009


Abstract

Software-controlled local memories (LMs) are widely used to provide fast, scalable, power-efficient, and predictable access to critical data. While many studies have addressed LM management, keeping hot data in the LM remains a major challenge. This paper revisits LM management of arrays in light of recent progress in register allocation, supporting multiple live-range splitting schemes through a generic integer linear program. These schemes differ in the granularity of their decision points. The model can also be extended to address fragmentation by assigning live ranges to precise offsets. We show that the links between LM management and register allocation have been underexploited, leaving many fundamental questions open and effective applications to be explored.
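
To make the idea of live-range splitting at decision points concrete, the sketch below solves a toy allocation instance exactly. It is not the paper's integer linear program: it is a small exhaustive dynamic program written for illustration, and the function name `allocate`, the gain tables, and the cost model (a copy-in cost proportional to array size, with no write-back cost) are all assumptions introduced here.

```python
from itertools import combinations

def allocate(sizes, gains, capacity, move_cost):
    """Toy local-memory allocator (illustrative assumption, not the paper's ILP).

    sizes:     {array_name: size}
    gains:     one dict per decision point, {array_name: gain if resident there}
    capacity:  local-memory size
    move_cost: assumed copy-in cost per unit of size (write-backs are ignored)
    Returns (best_value, schedule); schedule[p] is the set resident at point p.
    """
    names = sorted(sizes)
    # Capacity constraint: keep only the subsets of arrays that fit in the LM.
    feasible = [frozenset(c)
                for r in range(len(names) + 1)
                for c in combinations(names, r)
                if sum(sizes[n] for n in c) <= capacity]
    # best[s] = (value, schedule) of the best plan that ends with resident set s.
    best = {frozenset(): (0, [])}  # the LM starts empty
    for gain in gains:
        nxt = {}
        for s in feasible:
            g = sum(gain.get(n, 0) for n in s)
            candidates = []
            for prev, (val, sched) in best.items():
                # Pay a copy-in cost for every array newly brought into the LM.
                loads = sum(move_cost * sizes[n] for n in s - prev)
                candidates.append((val + g - loads, sched + [s]))
            nxt[s] = max(candidates, key=lambda t: t[0])
        best = nxt
    return max(best.values(), key=lambda t: t[0])

# Hypothetical instance: A is hot at points 0 and 2, B is hot at point 1.
sizes = {"A": 4, "B": 4, "C": 2}
gains = [{"A": 10, "C": 3}, {"A": 1, "B": 12, "C": 3}, {"A": 10, "C": 3}]
value, schedule = allocate(sizes, gains, capacity=6, move_cost=1)
print(value, [sorted(s) for s in schedule])
```

On this instance, the best plan keeps A and C resident at the first point, swaps A for B at the second, and reloads A at the third, so A's live range is split. Keeping A and C resident throughout would score 24 under the same cost model, against 27 for the split plan. The exhaustive search is only practical for a handful of arrays; an ILP formulation such as the one the abstract describes is what makes this kind of model scale.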