Improving scratchpad allocation with demand-driven data tiling

Authors:
Xuejun Yang; Li Wang; Jingling Xue; Tao Tang; Xiaoguang Ren; Sen Ye
Affiliations:
National University of Defense Technology, Changsha, China; National University of Defense Technology, Changsha, China; University of New South Wales, Sydney, Australia; National University of Defense Technology, Changsha, China; National University of Defense Technology, Changsha, China; University of New South Wales, Sydney, Australia
Venue:
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Year:
2010

Abstract

Existing scratchpad memory (SPM) allocation algorithms for arrays, whether they rely on well-crafted heuristics or on integer linear programming (ILP), typically assume that every array is small enough to fit directly into the SPM. As a result, some arrays must be spilled entirely to off-chip memory to make room for others to stay in the SPM, which can lead to poor SPM utilization. In this paper, we introduce a new comparability graph coloring allocator that, for the first time, integrates data tiling with SPM allocation for arrays, tiling arrays on demand to improve SPM utilization. The novelty lies in repeatedly identifying the heaviest path in an array interference graph and then reducing its weight by tiling certain arrays on that path appropriately with respect to the size of the SPM. The effectiveness of our allocator, which is presently restricted to tiling 1-D arrays, is validated on a number of selected benchmarks for which existing allocators are ineffective.
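The heaviest-path loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' allocator. It assumes the array interference graph is already given as an acyclic (transitively oriented) graph `succ`, that a node's weight is the number of bytes the array occupies in the SPM, and that "tiling" an array simply halves its resident footprint. The names `heaviest_path` and `tile_on_demand` and the halve-the-largest-array policy are hypothetical choices made for illustration only.

```python
def heaviest_path(succ, weight):
    """Return (total_weight, path) for the heaviest path in a DAG.

    succ maps a node to its successors; weight maps every node to its size.
    """
    memo = {}

    def best_from(n):
        # Heaviest path starting at n, computed once per node.
        if n not in memo:
            tails = [best_from(s) for s in succ.get(n, ())]
            tail = max(tails, key=lambda t: t[0], default=(0, []))
            memo[n] = (weight[n] + tail[0], [n] + tail[1])
        return memo[n]

    return max((best_from(n) for n in weight), key=lambda t: t[0])


def tile_on_demand(succ, size, spm_size, min_tile=1):
    """Shrink arrays on the heaviest path until that path fits in the SPM.

    Assumed policy (not from the paper): halve the largest array on the path.
    Returns the resident footprint chosen for each array.
    """
    resident = dict(size)
    while True:
        total, path = heaviest_path(succ, resident)
        if total <= spm_size:
            return resident
        victim = max(path, key=lambda a: resident[a])
        if resident[victim] <= min_tile:
            raise ValueError("heaviest path cannot be made to fit in the SPM")
        # Tile the victim: keep only half of it (rounded up) in the SPM.
        resident[victim] = max(min_tile, (resident[victim] + 1) // 2)


# Hypothetical example: four arrays (sizes in bytes) and a 1024-byte SPM.
succ = {"A": ["B", "C"], "B": ["C"], "D": ["C"]}
size = {"A": 600, "B": 500, "C": 300, "D": 200}
resident = tile_on_demand(succ, size, spm_size=1024)
```

In this example the heaviest path is A, B, C with weight 1400, which exceeds the SPM size. The sketch first halves A, then B, after which the heaviest path weighs 850 and fits. Arrays that are not on an overweight path, such as D, are left untiled, which is the demand-driven aspect.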