Introduction to algorithms
Profile guided code positioning
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Efficient procedure mapping using cache line coloring
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Procedure placement using temporal ordering information
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Linear and Time Minimum-Cost Matching Algorithms for Quasi-Convex Tours
SIAM Journal on Computing
Efficiency of a Good But Not Linear Set Union Algorithm
Journal of the ACM (JACM)
Procedure placement using temporal-ordering information
ACM Transactions on Programming Languages and Systems (TOPLAS)
Lx: a technology platform for customizable VLIW embedded processing
Proceedings of the 27th annual international symposium on Computer architecture
Temporal-Based Procedure Reordering for Improved Instruction Cache Performance
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
Algorithmic Graph Theory and Perfect Graphs (Annals of Discrete Mathematics, Vol 57)
Algorithmic Graph Theory and Perfect Graphs (Annals of Discrete Mathematics, Vol 57)
Optimal task placement to improve cache performance
EMSOFT '07 Proceedings of the 7th ACM & IEEE international conference on Embedded software
Techniques and tools for implementing IEEE 754 floating-point arithmetic on VLIW integer processors
Proceedings of the 4th International Workshop on Parallel and Symbolic Computation
Joint task assignment and cache partitioning with cache locking for WCET minimization on MPSoC
Journal of Parallel and Distributed Computing
WCET-driven branch prediction aware code positioning
CASES '11 Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
Instruction cache locking for multi-task real-time embedded systems
Real-Time Systems
WCET-centric partial instruction cache locking
Proceedings of the 49th Annual Design Automation Conference
Instruction Cache Locking for Embedded Systems using Probability Profile
Journal of Signal Processing Systems
Hi-index | 0.00 |
In a direct-mapped instruction cache, all instructions that have the same memory address modulo the cache size share a common and unique cache slot. Instruction cache conflicts can be partially handled at linked time by procedure placement. Pettis and Hansen give in [1] an algorithm that reorders procedures in memory by aggregating them in a greedy fashion. The Gloy and Smith algorithm [2] greatly decreases the number of conflict-misses but increases the code size by allowing gaps between procedures. The latter contains two main stages: the cache-placement phase assigns modulo addresses to minimizes cache-conflicts; the memory-placement phase assigns final memory addresses under the modulo placement constraints, and minimizes the code size expansion. In this paper: (1) we prove the NP-completeness of the cache-placement problem; (2) we provide an optimal algorithm to the memory-placement problem with complexity O(nmin(n,L)ack(n)) (n is the number of procedures, L the cache size, α is the inverse Ackermann's function that is lower than 4 in practice); (3) we take final program size into consideration during the cache-placement phase. Our modifications to the Gloy and Smith algorithm gives on average a code size expansion of 8% over the original program size, while the initial algorithm gave an expansion of 177%. The cache miss reduction is nearly the same as the Gloy and Smith solution with 35% cache miss reduction.