Recursive function data allocation to scratch-pad memory

Authors:
Angel Dominguez;Nghi Nguyen;Rajeev K. Barua
Affiliations:
University of Maryland;University of Maryland;University of Maryland
Venue:
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Year:
2007

Abstract

This paper presents the first automatic scheme to allocate local (stack) data in recursive functions to scratch-pad memory (SPM) in embedded systems. A scratch-pad is a fast, directly addressed, compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its significantly lower access time, energy consumption, real-time bounds, area and overall runtime. Existing compiler methods for allocating data to scratch-pad are able to place only code, global, heap and non-recursive stack data in scratch-pad memory; stack data for recursive functions is allocated entirely in DRAM, resulting in poor performance. In this paper we present a dynamic yet compiler-directed allocation method for recursive function stack data that, for the first time, is able to place a portion of recursive stack data in scratch-pad. It has almost no software-caching overhead, and is able to move recursive function data back and forth between scratch-pad and DRAM to better track the program's locality characteristics. With our method, all code, global, stack and heap variables can share the same scratch-pad. When compared to placing all recursive function data in DRAM and all other variables in scratch-pad, our results show that our method reduces the average runtime of our benchmarks by 29.3%, and the average power consumption by 31.1%, for the same size of scratch-pad fixed at 5% of total data size. Furthermore, significant savings were observed when comparing our method against cache-based alternatives for SPM allocation. Finally, we show results that analyze the effects of profile variation on our allocation approach and present a modified version of our method which minimizes variation for profile-based allocations.
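
To make the idea of placing only a portion of recursive stack data in scratch-pad concrete, here is a toy cost model. It is a minimal sketch, not the allocation method of the paper, which the abstract does not detail. The fixed depth threshold `spm_frames`, the function `simulate`, and the cycle costs `SPM_CYCLES` and `DRAM_CYCLES` are all hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: a fixed depth-threshold policy, NOT the paper's algorithm.
SPM_CYCLES = 1    # assumed cost of one scratch-pad access (hypothetical)
DRAM_CYCLES = 20  # assumed cost of one DRAM access (hypothetical)


def simulate(n, spm_frames):
    """Run a recursive factorial of n and tally the assumed frame-access cost.

    Frames at recursion depth < spm_frames are treated as living in
    scratch-pad; all deeper frames are treated as living in DRAM.
    Returns (factorial result, total assumed cycles).
    """
    cycles = 0

    def fact(k, depth):
        nonlocal cycles
        # Charge one access to whichever memory holds this frame.
        cycles += SPM_CYCLES if depth < spm_frames else DRAM_CYCLES
        if k <= 1:
            return 1
        return k * fact(k - 1, depth + 1)

    return fact(n, 0), cycles


if __name__ == "__main__":
    for frames in (0, 2, 5):
        print(frames, simulate(5, frames))
```

Under these assumed costs, raising `spm_frames` from 0 to 2 for a depth-5 recursion lowers the tally from 100 to 62 cycles. This shows only why keeping some frames in scratch-pad can help; it says nothing about how the paper chooses which frames to place there or when to move them.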