Procedure placement using temporal-ordering information
ACM Transactions on Programming Languages and Systems (TOPLAS)
Data cache locking for higher program predictability
SIGMETRICS '03 Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Predicting whole-program locality through reuse distance analysis
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Low-Complexity Algorithms for Static Cache Locking in Multitasking Hard Real-Time Systems
RTSS '02 Proceedings of the 23rd IEEE Real-Time Systems Symposium
An API for Runtime Code Patching
International Journal of High Performance Computing Applications
Improving power efficiency with compiler-assisted cache replacement
Journal of Embedded Computing - Cache exploitation in embedded systems
Compile-time decided instruction cache locking using worst-case execution paths
CODES+ISSS '07 Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis
Exploring locking & partitioning for predictable shared caches on multi-cores
Proceedings of the 45th annual Design Automation Conference
Instruction cache locking inside a binary rewriter
CASES '09 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems
Improved procedure placement for set associative caches
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
An algorithm for deciding minimal cache sizes in real-time systems
Proceedings of the 13th annual conference on Genetic and evolutionary computation
WCET-centric partial instruction cache locking
Proceedings of the 49th Annual Design Automation Conference
Instruction Cache Locking for Embedded Systems using Probability Profile
Journal of Signal Processing Systems
An analytical approach for fast and accurate design space exploration of instruction caches
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
The performance of most embedded systems is critically dependent on the average memory access latency. Improving the cache hit rate can have significant positive impact on the performance of an application. Modern embedded processors often feature cache locking mechanisms that allow memory blocks to be locked in the cache under software control. Cache locking was primarily designed to offer timing predictability for hard real-time applications. Hence, the compiler optimization techniques focus on employing cache locking to improve worst-case execution time. However, cache locking can be quite effective in improving the average-case execution time of general embedded applications as well. In this paper, we explore static instruction cache locking to improve average-case program performance. We introduce temporal reuse profile to accurately and efficiently model the cost and benefit of locking memory blocks in the cache. We propose an optimal algorithm and a heuristic approach that use the temporal reuse profile to determine the most beneficial memory blocks to be locked in the cache. Experimental results show that locking heuristic achieves close to optimal results and can improve the cache miss rate by up to 24% across a suite of real-world benchmarks. Moreover, our heuristic provides significant improvement compared to the state-of-the-art locking algorithm both in terms of performance and efficiency.