Architectural leakage-aware management of partitioned scratchpad memories
Proceedings of the conference on Design, automation and test in Europe
SCOPES '07 Proceedingsof the 10th international workshop on Software & compilers for embedded systems
Locality-driven architectural cache sub-banking for leakage energy reduction
ISLPED '07 Proceedings of the 2007 international symposium on Low power electronics and design
Architectural support for shadow memory in multiprocessors
Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Runtime monitoring on multicores via OASES
ACM SIGOPS Operating Systems Review
A hardware/software framework for instruction and data scratchpad memory allocation
ACM Transactions on Architecture and Code Optimization (TACO)
Partitioning and allocation of scratch-pad memory for priority-based preemptive multi-task systems
Proceedings of the Conference on Design, Automation and Test in Europe
Efficient OpenMP support and extensions for MPSoCs with explicitly managed memory hierarchy
Proceedings of the Conference on Design, Automation and Test in Europe
Run-time reconfiguration of expandable cache for embedded systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.03 |
Focusing on embedded applications, scratchpad memories (SPMs) look like a best-compromise solution when taking into account performance, energy consumption, and die area. The main challenge in SPM design is to optimally map memory locations to scratchpad locations. This paper describes an algorithm to solve such a mapping problem by means of dynamic programming applied to a synthesizable hardware architecture. The algorithm works by mapping segments of external memory to physically partitioned banks of an on-chip SPM; this architecture provides significant energy savings. The algorithm does not require any user-set bound on the number of partitions and takes into account partitioning overhead. Improving on previous solutions, execution time is polynomial in the number of memory locations, even in the most general solving policy. This has the major practical advantage of allowing an arbitrary number of scratchpad segments, something that was impossible with previous methods, whose running time is exponential to this number. Strategies to optimize memory requirements and speed of the algorithm are exploited. Additionally, we integrate this algorithm in a complete and automated design, simulation, and synthesis flow.