We propose a code scratchpad memory (SPM) management technique with demand paging for embedded systems that have no memory management unit. Based on profiling information, a postpass optimizer analyzes and optimizes application binaries in a fully automated process. Using a mixed integer linear programming formulation, it classifies the code of the application, including libraries, into three classes: external code is executed directly from the external memory; pinned code is loaded into the SPM when the application starts and stays in the SPM; paged code is loaded into and unloaded from the SPM on demand. We evaluate the proposed technique by running 14 embedded applications on both a cycle-accurate ARM processor simulator and an ARM1136JF-S core. On the simulator, the reference case, a four-way set-associative cache, is compared to a direct-mapped cache and an SPM of comparable die area. On average, we observe a 12 percent improvement in runtime performance and a 21 percent reduction in energy consumption. On the ARM11 board, the reference case, run on the 16-KB four-way set-associative cache, is compared to the demand paging solution on the 16-KB SPM, optionally supported by the cache. The measured results show both an improvement in runtime performance and a reduction in energy consumption by 23 percent, on average.
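To make the three-way classification concrete, the following is a minimal sketch of the decision the optimizer has to make. It is not the formulation used in this work: the abstract does not give the mixed integer linear program, so the sketch substitutes an exhaustive search over all assignments, which is only practical for a handful of code blocks. The cost constants, the single shared page buffer, the per-block fault counts, and every block name and size are hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical cycle costs (assumptions, not taken from the paper).
EXT_COST = 10    # cycles per execution from external memory
SPM_COST = 1     # cycles per execution from the SPM
LOAD_COST = 200  # cycles to copy one paged block into the SPM

CLASSES = ("external", "pinned", "paged")

# Hypothetical profile: (name, size in bytes, executions, page loads if paged).
BLOCKS = [
    ("isr", 256, 9000, 50),
    ("codec_loop", 512, 5000, 20),
    ("init", 1024, 1, 1),
    ("libc_printf", 512, 100, 1),
    ("libc_memcpy", 512, 300, 2),
]


def block_cost(block, cls):
    """Estimated cycles spent in one block under the given class."""
    _, _, execs, faults = block
    if cls == "external":
        return execs * EXT_COST
    if cls == "pinned":
        return execs * SPM_COST
    # Paged code runs from the SPM but pays for every load on demand.
    return execs * SPM_COST + faults * LOAD_COST


def classify(blocks, spm_size, page_buf):
    """Return (cost, {name: class}) minimizing total cycles.

    Exhaustive search standing in for a MILP solver. Pinned blocks
    occupy the SPM permanently; all paged blocks share one buffer of
    page_buf bytes, reserved only if at least one block is paged.
    """
    best = None
    for classes in product(CLASSES, repeat=len(blocks)):
        pairs = list(zip(blocks, classes))
        paged = [b for b, c in pairs if c == "paged"]
        if any(b[1] > page_buf for b in paged):
            continue  # a paged block must fit into the page buffer
        used = sum(b[1] for b, c in pairs if c == "pinned")
        used += page_buf if paged else 0
        if used > spm_size:
            continue  # assignment does not fit into the SPM
        cost = sum(block_cost(b, c) for b, c in pairs)
        if best is None or cost < best[0]:
            best = (cost, {b[0]: c for b, c in pairs})
    return best


if __name__ == "__main__":
    cost, assignment = classify(BLOCKS, spm_size=1280, page_buf=512)
    print(cost)
    for name, cls in assignment.items():
        print(f"{name}: {cls}")
```

With these made-up numbers and a 1280-byte SPM, the search pins the two hot blocks, pages the two library routines through the shared 512-byte buffer, and leaves the rarely executed initialization code in external memory, for an estimated 15010 cycles. A real instance would replace the enumeration with a MILP solver and the toy cost model with measured profile data.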