This paper presents a system in which already executing user code is continually and automatically reoptimized in the background, guided by dynamically collected execution profiles. Whenever a new code image has been constructed in the background, it is hot-swapped in place of the previously executing one; control is then transferred to the new code, and construction of yet another code image begins in the background. Two new runtime optimization techniques have been implemented in this system: object layout adaptation and dynamic trace scheduling. The former continually improves the storage layout of dynamically allocated data structures to improve data cache locality. The latter increases instruction-level parallelism by continually adapting the instruction schedule to the predominantly executed program paths. The empirical results presented in this paper make a case for continuous optimization, but also reveal some of its pitfalls and current shortcomings. If not applied judiciously, the costs of dynamic optimization outweigh its benefits in many situations, so that no break-even point is ever reached. In favorable circumstances, however, speed-ups of over 96 percent have been observed. The main beneficiaries of continuous optimization appear to be shared libraries in specific application domains, which can be optimized at different times in the context of the currently dominant client application.
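The profile, rebuild, and hot-swap cycle described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. All names here (`ContinuousOptimizer`, `hot_threshold`, `slow_square`, `fast_square`) are hypothetical, and a "code image" is modeled as a plain dictionary of callables. Reoptimization is triggered by an explicit call rather than by a background thread, and the "optimizer" merely substitutes a pre-written faster variant for routines that the profile marks as hot.

```python
from collections import Counter
from typing import Callable, Dict

# Assumption: a "code image" is modeled as a table of named callables.
Image = Dict[str, Callable[[int], int]]


def slow_square(x: int) -> int:
    """Deliberately slow stand-in for unoptimized code (x >= 0)."""
    return sum(x for _ in range(x))


def fast_square(x: int) -> int:
    """Stand-in for the reoptimized version of the same routine."""
    return x * x


class ContinuousOptimizer:
    """Toy model of the profile / rebuild / hot-swap cycle."""

    def __init__(self, image: Image, hot_threshold: int) -> None:
        self.image: Image = dict(image)
        self.profile: Counter = Counter()   # execution counts per routine
        self.hot_threshold = hot_threshold  # hypothetical tuning knob
        self.swaps = 0

    def call(self, name: str, arg: int) -> int:
        # Collect the execution profile while running the current image.
        self.profile[name] += 1
        return self.image[name](arg)

    def reoptimize(self, faster_variants: Image) -> bool:
        """Build a new image from the profile and swap it in if it differs."""
        new_image = dict(self.image)
        changed = False
        for name, count in self.profile.items():
            variant = faster_variants.get(name)
            if (count >= self.hot_threshold
                    and variant is not None
                    and new_image[name] is not variant):
                new_image[name] = variant
                changed = True
        if changed:
            self.image = new_image  # hot swap: a single reference assignment
            self.swaps += 1
        self.profile.clear()        # start a fresh profiling period
        return changed


opt = ContinuousOptimizer({"square": slow_square}, hot_threshold=3)
results = [opt.call("square", n) for n in range(5)]   # runs the slow image
swapped = opt.reoptimize({"square": fast_square})     # "square" is now hot
after = opt.call("square", 12)                        # runs the new image
```

The sketch keeps the old image untouched until the new one is complete, which mirrors the idea of constructing a replacement image on the side and then transferring control to it. It does not model the cost of optimization, so it says nothing about whether a break-even point would be reached.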