Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
The program dependence graph and its use in optimization
ACM Transactions on Programming Languages and Systems (TOPLAS)
Branch folding in the CRISP microprocessor: reducing branch delay to zero
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
The hardware architecture of the CRISP microprocessor
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Program optimization for instruction caches
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
ACM Computing Surveys (CSUR)
Dependence graphs and compiler optimizations
POPL '81 Proceedings of the 8th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
The design of a RISC based multiprocessor chip
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Compile time instruction cache optimizations
ACM SIGARCH Computer Architecture News - Special issue: panel sessions of the 1991 workshop on multithreaded computers
SPAID: software prefetching in pointer- and call-intensive environments
Proceedings of the 28th annual international symposium on Microarchitecture
Analysis of techniques to improve protocol processing latency
Conference proceedings on Applications, technologies, architectures, and protocols for computer communications
Optimizing the Instruction Cache Performance of the Operating System
IEEE Transactions on Computers
Profile-directed restructuring of operating system code
IBM Systems Journal
Cluster miss prediction with prefetch on miss for embedded CPU instruction caches
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Reconciling real-time guarantees and energy efficiency through unlocked-cache prefetching
Proceedings of the 50th Annual Design Automation Conference
Hi-index | 0.00 |
In this paper we describe compiler techniques for improving instruction cache performance. Through repositioning of the code in main memory, leaving memory locations unused, code duplication, and code propagation, the effectiveness of the cache can be improved due to reduced cache pollution and fewer cache misses. Results of experiments indicate that significant reduction in bus traffic results from the use of these techniques. Since memory bandwidth is a critical resource in shared memory multiprocessors, such systems can benefit from the techniques described. The notion of control dependence is used to decide when instructions belonging to different basic blocks can be allowed to share the same cache line without increasing cache pollution.