Previous research has demonstrated that scratchpad memory (SPM) consumes far less power and on-chip area than a traditional cache. As a software-managed memory, SPM has been widely adopted in today's mainstream embedded processors. Traditional SPM allocation strategies depend on either the compiler or the programmer to manage this small memory. Compiler-based methods predict the frequently referenced data items before execution through static analysis or profiling, whereas programmer-based methods require the programmer to allocate the SPM space manually. For dynamic heap data, there is no mature allocation scheme for multicore processors with a shared software-managed on-chip memory. This paper presents a novel SPM management framework for chip multiprocessors (CMPs) featuring a partitioned global address space (PGAS) SPM architecture. The framework keeps the most frequently referenced heap data in the SPM. It mitigates the SPM allocation problem by using the programmer's hints to determine which data items are allocated to the SPM. The complex and error-prone allocation procedure is handled entirely by an SPM management library (SPMMLIB), transparently to the programmer. Performance is evaluated on a homogeneous UltraSPARC multiprocessor using the PARSEC and SPLASH-2 benchmarks. Experimental results indicate that, on average, energy consumption is reduced by 22.4% compared with a cache-based memory architecture.
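The hint-driven idea can be illustrated with a small model. The sketch below is not SPMMLIB's interface, which the abstract does not describe; the class and method names (`HintedSpmAllocator`, `alloc`, `free`), the integer `hint` priority, and the evict-the-coldest policy are all assumptions made for illustration. It shows only how a library could take a programmer's hint at allocation time and decide on its own whether a heap block lives in the SPM or falls back to main memory.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    size: int            # bytes requested
    hint: int            # hypothetical programmer hint: higher means hotter
    in_spm: bool = False


@dataclass
class HintedSpmAllocator:
    """Toy model of a hint-driven SPM allocator (illustrative only)."""
    spm_capacity: int
    spm_used: int = 0
    blocks: dict = field(default_factory=dict)

    def alloc(self, name, size, hint):
        # Register the block, then try to place it in the SPM.
        block = Block(name, size, hint)
        self.blocks[name] = block
        self._place(block)
        return block

    def free(self, name):
        # Release the block and return its SPM space, if it held any.
        block = self.blocks.pop(name)
        if block.in_spm:
            self.spm_used -= block.size

    def _place(self, block):
        if block.size > self.spm_capacity:
            return  # can never fit; stays in main memory
        # Candidate victims: resident blocks with a lower hint, coldest first.
        colder = sorted(
            (b for b in self.blocks.values() if b.in_spm and b.hint < block.hint),
            key=lambda b: b.hint,
        )
        free_space = self.spm_capacity - self.spm_used
        victims = []
        for victim in colder:
            if free_space >= block.size:
                break
            victims.append(victim)
            free_space += victim.size
        if free_space < block.size:
            return  # not enough colder data to evict; nothing is moved
        for victim in victims:
            victim.in_spm = False  # demoted to main memory
            self.spm_used -= victim.size
        block.in_spm = True
        self.spm_used += block.size


# Usage: a 64-byte SPM; the hottest blocks end up resident.
spm = HintedSpmAllocator(spm_capacity=64)
a = spm.alloc("a", 32, hint=1)
b = spm.alloc("b", 32, hint=2)
c = spm.alloc("c", 32, hint=3)   # evicts "a", the coldest resident block
```

In this model the caller never picks an SPM address; it supplies only a size and a hint, which mirrors the division of labor the abstract describes between the programmer and the management library. A real implementation would also have to copy data on eviction and handle the per-core partitions of a PGAS layout, which this sketch leaves out.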