SPM-Sieve: a framework for assisting data partitioning in scratch pad memory based systems

Authors:
Prasenjit Chakraborty;Preeti Ranjan Panda
Affiliations:
Intel Technology India Pvt. Ltd., Bangalore;Indian Institute of Technology Delhi, New Delhi
Venue:
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
Year:
2013

Citing 30
Cited 0

Reconfigurable caches and their application to media processing

Proceedings of the 27th annual international symposium on Computer architecture
Data and memory optimization techniques for embedded systems

ACM Transactions on Design Automation of Electronic Systems (TODAES)
An optimal memory allocation scheme for scratch-pad-based embedded systems

ACM Transactions on Embedded Computing Systems (TECS)
Scratchpad memory: design alternative for cache on-chip memory in embedded systems

Proceedings of the tenth international symposium on Hardware/software codesign
Predicting whole-program locality through reuse distance analysis

PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Efficient Utilization of Scratch-Pad Memory in Embedded Processor Applications

EDTC '97 Proceedings of the 1997 European conference on Design and Test
EMBARC: an efficient memory bank assignment algorithm for retargetable compilers

Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Dynamic overlay of scratchpad memory for energy minimization

Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Pin: building customized program analysis tools with dynamic instrumentation

Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Memory Coloring: A Compiler Approach for Scratchpad Memory Management

Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
WCET Centric Data Allocation to Scratchpad Memory

RTSS '05 Proceedings of the 26th IEEE International Real-Time Systems Symposium
Hardware/software managed scratchpad memory for embedded system

Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
The Tau Parallel Performance System

International Journal of High Performance Computing Applications
An integrated scratch-pad allocator for affine and non-affine code

Proceedings of the conference on Design, automation and test in Europe: Proceedings
Dynamic scratch-pad memory management for irregular array access patterns

Proceedings of the conference on Design, automation and test in Europe: Proceedings
Decomposing memory performance: data structures and phases

Proceedings of the 5th international symposium on Memory management
Orchestrating data transfer for the cell/B.E. processor

Proceedings of the 22nd annual international conference on Supercomputing
Efficient dynamic heap allocation of scratch-pad memory

Proceedings of the 7th international symposium on Memory management
Scratchpad allocation for concurrent embedded software

CODES+ISSS '08 Proceedings of the 6th IEEE/ACM/IFIP international conference on Hardware/Software codesign and system synthesis
SPM management using Markov chain based data access prediction

Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
A software solution for dynamic stack management on scratch pad memory

Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Trace-based Performance Analysis on Cell BE

ISPASS '08 Proceedings of the ISPASS 2008 - IEEE International Symposium on Performance Analysis of Systems and software
Improving scratchpad allocation with demand-driven data tiling

CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Optimal WCET-aware code selection for scratchpad memory

EMSOFT '10 Proceedings of the tenth ACM international conference on Embedded software
Compilation of stream programs onto scratchpad memory based embedded multicore processors through retiming

Proceedings of the 48th Design Automation Conference
Vector class on limited local memory (LLM) multi-core processors

CASES '11 Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems
Sniper: exploring the level of abstraction for scalable and accurate parallel multi-core simulation

Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Pinpointing data locality problems using data-centric analysis

CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
Integrating software caches with scratch pad memory

Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems
Runnemede: An architecture for Ubiquitous High-Performance Computing

HPCA '13 Proceedings of the 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Modern system architectures sometimes include scratch pad memories (SPM) in their memory hierarchy to take advantage of their simpler design, in an attempt to meet the system area, performance, and power budget. These systems employing SPM can be broadly categorized as: (a) cacheless systems with only SPM, (b) hybrid systems with both cache and SPM, and (c) reconfigurable systems with the provision to reconfigure local memory as either cache, SPM, or a combination of the two. However SPM based systems have needed larger efforts spent on their programming, mainly due to allocating data and orchestrating data transfers explicitly by software. Tight product development cycles require faster development and porting of diverse applications to multiple SPM based architectures. In this paper we present SPM-Sieve, a profile-based tool and framework targeted for SPM based architectures that generates partitioning decisions of the first level memory in the system hierarchy, and suggests object mapping amongst the memory partitions without resorting to detailed simulation of all configurations. This is done by natively executing an application and using minimal target architecture specification, which not only provides early information influencing data organization in the application, but also provides a foundation for other more sophisticated algorithms to produce optimized allocations. We demonstrate the utility and generality of SPM-Sieve by evaluating it on a large number of SPEC2000 benchmarks targeted for a 128KB first level memory. We evaluate its effectiveness by performing simulation studies comparing the partition suggested by the tool against varying partition sizes, and observe that its suggestions are very competitive for SPM based architectures with and without caches.