SDRM: simultaneous determination of regions and function-to-region mapping for scratchpad memories

Authors:
Amit Pabalkar;Aviral Shrivastava;Arun Kannan;Jongeun Lee
Affiliations:
Department of Computer Science and Engineering, Arizona State University, Tempe, AZ;Department of Computer Science and Engineering, Arizona State University, Tempe, AZ;Department of Computer Science and Engineering, Arizona State University, Tempe, AZ;Department of Computer Science and Engineering, Arizona State University, Tempe, AZ
Venue:
HiPC'08 Proceedings of the 15th international conference on High performance computing
Year:
2008



Abstract

Many programmable embedded systems feature low-power processors coupled with fast, compiler-controlled on-chip scratchpad memories (SPMs) to reduce their energy consumption. SPMs are more efficient than caches in terms of energy consumption, performance, area, and timing predictability. However, unlike caches, SPMs need explicit management by software, the quality of which can impact the performance of SPM-based systems. In this paper, we present a fully automated, dynamic code overlaying technique for SPMs based on pure static analysis. Static analysis is less restrictive than profiling and can be easily extended to a general compiler framework where the time-consuming and expensive task of profiling may not be feasible. The SPM code mapping problem is harder than the bin packing problem, which is NP-complete. Therefore, we formulate the SPM code mapping as a binary integer linear programming problem and also propose a heuristic that simultaneously determines the region (bin) sizes and the function-to-region mapping. To the best of our knowledge, this is the first heuristic that simultaneously solves the interdependent problems of region size determination and function-to-region mapping. We evaluate our approach on a set of MiBench applications on a horizontally split I-cache and SPM architecture (HSA). Compared to a cache-only architecture (COA), the HSA gives an average energy reduction of 35%, with minimal performance degradation. For the HSA, we also compare the energy results from our proposed SDRM heuristic against a previous static-analysis-based mapping heuristic and observe an average 27% energy reduction.
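
The interdependence between region sizes and the function-to-region mapping can be illustrated with a small sketch. This is not the SDRM heuristic or the paper's ILP formulation. It is a brute-force toy under assumed rules: a region's size is taken to be the largest function mapped to it, the region sizes must sum to at most the SPM capacity, and the cost is a hypothetical pairwise "interference" weight charged whenever two functions share a region. All function names, sizes, and weights below are invented for illustration.

```python
from itertools import product

def region_sizes(mapping, sizes, num_regions):
    """Assumed model: a region is as large as the biggest function mapped to it."""
    out = [0] * num_regions
    for func, region in mapping.items():
        out[region] = max(out[region], sizes[func])
    return out

def mapping_cost(mapping, interference):
    """Sum the hypothetical interference weight of every pair sharing a region."""
    return sum(w for (f, g), w in interference.items()
               if mapping[f] == mapping[g])

def best_mapping(sizes, interference, spm_size):
    """Try every function-to-region assignment; keep the cheapest one that fits.

    Region sizes are not chosen up front: they fall out of the mapping,
    so both are decided together. Returns (cost, mapping) or None.
    """
    funcs = sorted(sizes)
    n = len(funcs)  # at most one region per function is ever needed
    best = None
    for assignment in product(range(n), repeat=n):
        mapping = dict(zip(funcs, assignment))
        if sum(region_sizes(mapping, sizes, n)) > spm_size:
            continue  # regions would not fit in the scratchpad
        cost = mapping_cost(mapping, interference)
        if best is None or cost < best[0]:
            best = (cost, mapping)
    return best

# Invented example data: code sizes and pairwise interference weights.
sizes = {"f1": 4, "f2": 3, "f3": 2}
interference = {("f1", "f2"): 10, ("f1", "f3"): 1, ("f2", "f3"): 5}

cost, mapping = best_mapping(sizes, interference, spm_size=7)
print(cost, mapping)
```

With a capacity of 7, the three functions cannot each have their own region (4 + 3 + 2 = 9), so the search has to overlay at least two of them, and the cheapest pair to overlay under these weights is f1 and f3. Exhaustive search grows exponentially with the number of functions, which is consistent with the abstract pairing an exact formulation with a heuristic.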