Data-centric multi-level blocking
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Orienting graphs to optimize reachability
Information Processing Letters
Dynamic management of scratch-pad memory space
Proceedings of the 38th annual Design Automation Conference
Scratchpad memory: design alternative for cache on-chip memory in embedded systems
Proceedings of the tenth international symposium on Hardware/software codesign
Automatic storage optimization
SIGPLAN '79 Proceedings of the 1979 SIGPLAN symposium on Compiler construction
Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
Compiler-decided dynamic memory allocation for scratch-pad based embedded systems
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Algorithmic Graph Theory and Perfect Graphs (Annals of Discrete Mathematics, Vol 57)
Algorithmic Graph Theory and Perfect Graphs (Annals of Discrete Mathematics, Vol 57)
Optimistic register coalescing
ACM Transactions on Programming Languages and Systems (TOPLAS)
Dynamic overlay of scratchpad memory for energy minimization
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Merrimac: Supercomputing with Streams
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Memory Coloring: A Compiler Approach for Scratchpad Memory Management
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
On combining iteration space tiling with data space tiling for scratch-pad memory systems
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Dynamic allocation for scratch-pad memory using compile-time decisions
ACM Transactions on Embedded Computing Systems (TECS)
Scratchpad allocation for data aggregates in superperfect graphs
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Minimal comparability completions of arbitrary graphs
Discrete Applied Mathematics
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Entering the petaflop era: the architecture and performance of Roadrunner
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Compiler-directed scratchpad memory management via graph coloring
ACM Transactions on Architecture and Code Optimization (TACO)
Towards data tiling for whole programs in scratchpad memory allocation
ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture
A semi-automatic scratchpad memory management framework for CMP
APPT'11 Proceedings of the 9th international conference on Advanced parallel processing technologies
SPM-Sieve: a framework for assisting data partitioning in scratch pad memory based systems
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
Hi-index | 0.00 |
Existing scratchpad memory (SPM) allocation algorithms for arrays, whether they rely on well-crafted heuristics or resort to integer linear programming (ILP) techniques, typically assume that every array is small enough to fit directly into the SPM. As a result, some arrays have to be spilled entirely to the off-chip memory in order to make room for other arrays to stay in the SPM, resulting in sometimes poor SPM utilization. In this paper, we introduce a new comparability graph coloring allocator that integrates for the first time data tiling and SPM allocation for arrays by tiling arrays on-demand to improve utilization of the SPM. The novelty lies in repeatedly identifying the heaviest path in an array interference graph and then reducing its weight by tiling certain arrays on the path appropriately with respect to the size of the SPM. The effectiveness of our allocator, which is presently restricted to tiling 1-D arrays, is validated by using a number of selected benchmarks for which existing allocators are ineffective.