Instruction cache locking using temporal reuse profile

Authors:
Yun Liang;Tulika Mitra
Affiliations:
National University of Singapore;National University of Singapore
Venue:
Proceedings of the 47th Design Automation Conference
Year:
2010


Abstract

The performance of most embedded systems is critically dependent on the average memory access latency. Improving the cache hit rate can have a significant positive impact on the performance of an application. Modern embedded processors often feature cache locking mechanisms that allow memory blocks to be locked in the cache under software control. Cache locking was primarily designed to offer timing predictability for hard real-time applications. Hence, existing compiler optimization techniques focus on employing cache locking to improve worst-case execution time. However, cache locking can also be quite effective in improving the average-case execution time of general embedded applications. In this paper, we explore static instruction cache locking to improve average-case program performance. We introduce the temporal reuse profile to accurately and efficiently model the cost and benefit of locking memory blocks in the cache. We propose an optimal algorithm and a heuristic approach that use the temporal reuse profile to determine the most beneficial memory blocks to be locked in the cache. Experimental results show that the locking heuristic achieves close to optimal results and can improve the cache miss rate by up to 24% across a suite of real-world benchmarks. Moreover, our heuristic provides significant improvement over the state-of-the-art locking algorithm in terms of both performance and efficiency.
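
To make the cost/benefit trade-off of static locking concrete, the following is a minimal sketch and not the algorithm from the paper. It does not build a temporal reuse profile. Instead, it assumes a block-address trace is available and re-simulates an LRU set-associative cache for each candidate, greedily locking whichever block removes the most misses. The names `count_misses` and `greedy_lock`, the modulo set mapping, and the assumption that locked blocks are preloaded at no cost are all illustrative choices.

```python
from collections import OrderedDict


def count_misses(trace, num_sets, assoc, locked=frozenset()):
    """Count misses for a block-address trace on an LRU set-associative cache.

    Assumption: locked blocks are preloaded, so they always hit, and each
    one permanently takes one way out of its set.
    """
    free_ways = [assoc] * num_sets
    for block in locked:
        free_ways[block % num_sets] -= 1
    if min(free_ways) < 0:
        raise ValueError("more locked blocks than ways in some set")

    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in trace:
        if block in locked:
            continue  # locked block: guaranteed hit
        index = block % num_sets
        lru = sets[index]
        if block in lru:
            lru.move_to_end(block)  # hit: mark as most recently used
            continue
        misses += 1
        if free_ways[index] == 0:
            continue  # every way is locked, so this block cannot be cached
        if len(lru) >= free_ways[index]:
            lru.popitem(last=False)  # evict the least recently used block
        lru[block] = True
    return misses


def greedy_lock(trace, num_sets, assoc):
    """Greedily lock blocks while doing so strictly reduces the miss count."""
    locked = set()
    best = count_misses(trace, num_sets, assoc, locked)
    candidates = set(trace)
    while True:
        choice = None
        for block in sorted(candidates - locked):
            in_set = sum(1 for b in locked if b % num_sets == block % num_sets)
            if in_set >= assoc:
                continue  # no way left to lock in this set
            misses = count_misses(trace, num_sets, assoc, locked | {block})
            if misses < best:
                best, choice = misses, block
        if choice is None:
            return locked, best
        locked.add(choice)


# Blocks 0 and 2 conflict in set 0 of a direct-mapped, two-set cache.
trace = [0, 2, 0, 2, 0, 2, 1, 1]
baseline = count_misses(trace, num_sets=2, assoc=1)
locked, misses = greedy_lock(trace, num_sets=2, assoc=1)
print(baseline, sorted(locked), misses)
```

In this toy trace, locking block 0 ends the ping-pong conflict with block 2 at the price of making every access to block 2 a miss, which is the kind of cost a locking decision has to weigh against its benefit. Re-simulating the whole trace for every candidate is expensive; the abstract presents the temporal reuse profile as a way to model this cost and benefit efficiently.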