Program optimization for instruction caches
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Achieving high instruction cache performance with an optimizing compiler
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Evaluating Associativity in CPU Caches
IEEE Transactions on Computers
Profile guided code positioning
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Predicting program behavior using real or estimated profiles
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
An inter-reference gap model for temporal locality in program behavior
Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Efficient procedure mapping using cache line coloring
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Procedure placement using temporal ordering information
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Compile Time Instruction Cache Optimizations
CC '94 Proceedings of the 5th International Conference on Compiler Construction
Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
Optimizing instruction cache performance for operating system intensive workloads
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Temporal-Based Procedure Reordering for Improved Instruction Cache Performance
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
A proposal for input-sensitivity analysis of profile-driven optimizations on embedded applications
MEDEA '03 Proceedings of the 2003 workshop on MEmory performance: DEaling with Applications , systems and architecture
Optimizing instruction cache performance of embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
YAARC: yet another approach to further reducing the rate of conflict misses
The Journal of Supercomputing
Cache line reservation: exploring a scheme for cache-friendly object allocation
CASCON '09 Proceedings of the 2009 Conference of the Center for Advanced Studies on Collaborative Research
Improved procedure placement for set associative caches
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Hi-index | 0.00 |
In this paper, we examine temporal-based program interaction in order to improve layout by reducing the probability that program units will conflict in an instruction cache. In that context, we present two profile-guided procedure reordering algorithms. Both techniques use cache line coloring to arrive at a final program layout and target the elimination of first generation cache conflicts (i.e., conflicts between caller/callee pairs). The first algorithm builds a call graph that records local temporal interaction between procedures. We will describe how the call graph is used to guide the placement step and present methods that accelerate cache line coloring by exploring aggressive graph pruning techniques. In the second approach, we capture global temporal program interaction by constructing a Conflict Miss Graph (CMG). The CMG estimates the worst-case number of misses two competing procedures can inflict upon one another and reducing higher generation cache conflicts. We use a pruned CMG graph to guide cache line coloring. Using several C and C++ benchmarks, we show the benefits of letting both types of graphs guide procedure reordering to improve instruction cache hit rates. To contrast the differences between these two forms of temporal interaction, we also develop new characterization streams based on the Inter-Reference Gap (IRG) model.