Compile time instruction cache optimizations
ACM SIGARCH Computer Architecture News - Special issue: panel sessions of the 1991 workshop on multithreaded computers
Optimization of instruction fetch mechanisms for high issue rates
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Performance issues in correlated branch prediction schemes
Proceedings of the 28th annual international symposium on Microarchitecture
Augmenting Loop Tiling with Data Alignment for Improved Cache Performance
IEEE Transactions on Computers - Special issue on cache memory and related problems
A hardware mechanism for dynamic extraction and relayout of program hot spots
Proceedings of the 27th annual international symposium on Computer architecture
ICASSP '97 Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97) -Volume 1 - Volume 1
Code placement for improving dynamic branch prediction accuracy
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
The Camino Compiler infrastructure
ACM SIGARCH Computer Architecture News - Special issue on the 2005 workshop on binary instrumentation and application
Fast and efficient partial code reordering: taking advantage of dynamic recompilatior
Proceedings of the 5th international symposium on Memory management
Dynamic code management: improving whole program code locality in managed runtimes
Proceedings of the 2nd international conference on Virtual execution environments
Improving instruction locality with just-in-time code layout
NT'97 Proceedings of the USENIX Windows NT Workshop on The USENIX Windows NT Workshop 1997
Dynamic Cache Placement with Two-level Mapping to Reduce Conflict Misses
PACT '07 Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques
Reducing cache misses through programmable decoders
ACM Transactions on Architecture and Code Optimization (TACO)
Fast indexing for blocked array layouts to reduce cache misses
International Journal of High Performance Computing and Networking
Dynamic round-robin task scheduling to reduce cache misses for embedded systems
Proceedings of the conference on Design, automation and test in Europe
An Evaluation of Misaligned Data Access Handling Mechanisms in Dynamic Binary Translation Systems
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Code alignment for architectures with pipeline group dispatching
Proceedings of the 3rd Annual Haifa Experimental Systems Conference
MAO -- An extensible micro-architectural optimizer
CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
Hi-index | 0.00 |
Static alignment techniques are well studied and have been incorporated into compilers in order to optimize code locality for the instruction fetch unit in modern processors. However, current static alignment techniques have several limitations that cannot be overcome. In the exascale era, it becomes even more important to break from static techniques and develop adaptive algorithms in order to maximize the utilization of every processor cycle. In this paper, we explore those limitations and show that reactive realignment, a method where we dynamically monitor running applications, react to symptoms of poor alignment, and adapt alignment to the current execution environment and program input, is more scalable than static alignment. We present fetches-per-instruction as a runtime indicator of poor alignment. Additionally, we discuss three main opportunities that static alignment techniques cannot leverage, but which are increasingly important in large scale computing systems: microarchitectural differences of cores, dynamic program inputs that exercise different and sometimes alternating code paths, and dynamic branch behavior, including indirect branch behavior and phase changes. Finally, we will present several instances where our trigger for reactive realignment may be incorporated in practice, and discuss the limitations of dynamic alignment.