Compiler-decided dynamic memory allocation for scratch-pad based embedded systems

Authors:
Sumesh Udayakumaran;Rajeev Barua
Affiliations:
University of Maryland, College Park, MD;University of Maryland, College Park, MD
Venue:
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Year:
2003

Citing 15
Cited 82

Fine-grain access control for distributed shared memory. ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems.
Software caching and computation migration in Olden. Journal of Parallel and Distributed Computing - Special issue on compilation techniques for distributed memory systems.
Power analysis and minimization techniques for embedded DSP software. IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
Adaptive software cache management for distributed shared memory architectures. ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture.
A fully associative software-managed cache design. Proceedings of the 27th annual international symposium on Computer architecture.
Dynamic management of scratch-pad memory space. Proceedings of the 38th annual Design Automation Conference.
Storage allocation for embedded processors. CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems.
Heterogeneous memory management for embedded systems. CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems.
Modern Compiler Implementation in C.
Cool-cache for hot multimedia. Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture.
Computer architecture: a quantitative approach.
An optimal memory allocation scheme for scratch-pad-based embedded systems. ACM Transactions on Embedded Computing Systems (TECS).
Scratchpad memory: design alternative for cache on-chip memory in embedded systems. Proceedings of the tenth international symposium on Hardware/software codesign.
Application-level document caching in the Internet. SDNE '95 Proceedings of the 2nd International Workshop on Services in Distributed and Networked Environments.
Software Caching using Dynamic Binary Rewriting for Embedded Devices. ICPP '02 Proceedings of the 2002 International Conference on Parallel Processing.

Abstract

This paper presents a highly predictable, low-overhead, yet dynamic memory allocation strategy for embedded systems with scratch-pad memory. A scratch-pad is a fast, compiler-managed SRAM that replaces the hardware-managed cache. It is motivated by its better real-time guarantees compared with a cache and by its significantly lower overheads in energy consumption, area and overall runtime, even with a simple allocation scheme [4].

Existing scratch-pad allocation methods are of two types. First, software-caching schemes emulate the workings of a hardware cache in software. Instructions are inserted before each load/store to check the software-maintained cache tags. Such methods incur large overheads in runtime, code size, energy consumption and SRAM space for tags, and they deliver poor real-time guarantees, just like hardware caches. A second category of algorithms partitions variables at compile time into the two memory banks (the scratch-pad and main memory). For example, our previous work in [3] derives a provably optimal static allocation for global and stack variables and achieves a speedup over all earlier methods. However, a drawback of such static allocation schemes is that they do not account for dynamic program behavior. It is easy to see why a data allocation that never changes at runtime cannot achieve the full locality benefits of a cache.

In this paper we present a dynamic allocation method for global and stack data that, for the first time, (i) accounts for changing program requirements at runtime, (ii) has no software-caching tags, (iii) requires no run-time checks, (iv) has extremely low overheads, and (v) yields 100% predictable memory access times. In this method, data that is about to be accessed frequently is copied into the SRAM using compiler-inserted code at fixed and infrequent points in the program. Earlier data is evicted if necessary.

When compared to a provably optimal static allocation, our results show runtime reductions ranging from 11% to 38%, averaging 31.2%, using no additional hardware support. With hardware support for pseudo-DMA and full DMA, which is already provided in some commercial systems, the runtime reductions increase to 33.4% and 34.2%, respectively.
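
The dynamic method described in the abstract can be pictured with a small C sketch. This is an illustration under stated assumptions, not the authors' implementation: the array `spm` stands in for the on-chip SRAM, and the helpers `spm_bring_in` and `spm_write_back`, the capacity `SPM_WORDS`, and the two-phase workload are all hypothetical names chosen for the example. The point it tries to show is that transfers happen only at a few fixed program points, while the accesses inside each phase go straight to the scratch-pad with no tag check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SPM_WORDS 64  /* hypothetical scratch-pad capacity, in ints */

static int spm[SPM_WORDS];          /* models the on-chip SRAM */
static int coeffs_dram[SPM_WORDS];  /* global data living in main memory */
static int samples_dram[SPM_WORDS]; /* global data living in main memory */

/* Hypothetical compiler-inserted copy-in at a fixed program point. */
static void spm_bring_in(const int *dram, size_t n) {
    assert(n <= SPM_WORDS);
    memcpy(spm, dram, n * sizeof spm[0]);
}

/* Hypothetical compiler-inserted write-back, needed only for data
 * that was modified while it resided in the scratch-pad. */
static void spm_write_back(int *dram, size_t n) {
    assert(n <= SPM_WORDS);
    memcpy(dram, spm, n * sizeof spm[0]);
}

int run_two_phases(void) {
    for (size_t i = 0; i < SPM_WORDS; i++) {
        coeffs_dram[i] = (int)i;
        samples_dram[i] = 1;
    }

    /* Program point 1: coeffs is about to be accessed frequently. */
    spm_bring_in(coeffs_dram, SPM_WORDS);
    int sum = 0;
    for (int rep = 0; rep < 100; rep++)
        for (size_t i = 0; i < SPM_WORDS; i++)
            sum += spm[i];          /* direct access, no run-time check */
    /* coeffs was only read, so it can be evicted without a write-back. */

    /* Program point 2: samples becomes hot and replaces coeffs. */
    spm_bring_in(samples_dram, SPM_WORDS);
    for (int rep = 0; rep < 100; rep++)
        for (size_t i = 0; i < SPM_WORDS; i++)
            spm[i] += 1;            /* direct access, no run-time check */
    spm_write_back(samples_dram, SPM_WORDS);  /* modified data goes back */

    return sum + samples_dram[0];
}
```

A software-caching scheme would instead wrap every `spm[i]` access in a tag lookup. In this sketch the only overhead is the three `memcpy` calls, and every access in the hot loops has a location that is fixed at compile time, which is the source of the predictability claimed above.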