One of the most important issues in designing a chip multiprocessor is deciding its on-chip memory organization. A poor on-chip memory design can have serious power and performance implications when running data-intensive embedded applications. While it is possible to design an application-specific memory architecture, this may not be the best option, in particular when the storage demands of individual processors and/or their data sharing patterns change from one point in execution to another within the same application. In this paper, we consider dynamic configuration of software-managed on-chip memory space to adapt to runtime variations in data storage demand and interprocessor sharing patterns. The proposed framework is fully implemented using an optimizing compiler, a polyhedral tool, and a memory partitioner (based on integer linear programming), and is tested using a suite of eight data-intensive embedded applications. Our experimental evaluation indicates that the proposed technique is very effective in practice and consumes much less energy than all the alternative memory management schemes tested, including one that generates an application-specific memory.
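The idea of repartitioning a software-managed on-chip memory as per-processor demands shift between program phases can be illustrated with a small sketch. This is not the paper's partitioner: it replaces the integer linear program with an exhaustive search, and the cost model (off-chip accesses proportional to the fraction of a working set that does not fit on chip), the function name `best_partition`, and all numbers are assumptions made up for illustration.

```python
from itertools import product

def best_partition(working_sets, accesses, capacity, step):
    """Pick per-processor on-chip sizes for one program phase.

    Hypothetical cost model: processor p issues accesses[p] references to a
    working set of working_sets[p] KB (assumed > 0); the part of the working
    set that does not fit in its share goes off chip proportionally.
    Exhaustive search stands in for an ILP solver here.
    """
    sizes = range(0, capacity + 1, step)
    best_cost, best_alloc = None, None
    for alloc in product(sizes, repeat=len(working_sets)):
        if sum(alloc) > capacity:
            continue  # exceeds the total on-chip memory budget
        cost = sum(a * max(0, w - s) / w
                   for w, a, s in zip(working_sets, accesses, alloc))
        if best_cost is None or cost < best_cost:
            best_cost, best_alloc = cost, alloc
    return best_alloc, best_cost

# Two phases of one application on two processors; demand swaps between phases.
phases = [
    {"working_sets": [8, 2], "accesses": [1000, 100]},
    {"working_sets": [2, 8], "accesses": [100, 1000]},
]
plan = [best_partition(p["working_sets"], p["accesses"], capacity=8, step=2)[0]
        for p in phases]
print(plan)
```

Under these assumed inputs, the best split gives the whole 8 KB to the first processor in the first phase and to the second processor in the second phase. No single fixed partition achieves both, which is the motivation for reconfiguring at runtime rather than fixing an application-specific layout.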