Profile guided code positioning
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Dynamo: a transparent dynamic optimization system
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Adaptive optimization in the Jalapeño JVM
OOPSLA '00 Proceedings of the 15th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Software profiling for hot path prediction: less is more
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
An Empirical Study of Method In-lining for a Java Just-in-Time Compiler
Proceedings of the 2nd Java Virtual Machine Research and Technology Symposium
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Retargetable and reconfigurable software dynamic translation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Adaptive online context-sensitive inlining
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
An infrastructure for adaptive dynamic optimization
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Dynamic profiling and trace cache generation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Overview of the IBM Java just-in-time compiler
IBM Systems Journal
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Runtime specialization with optimistic heap analysis
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Improving Region Selection in Dynamic Optimization Systems
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Fast and efficient partial code reordering: taking advantage of dynamic recompilatior
Proceedings of the 5th international symposium on Memory management
Evaluating fragment construction policies for SDT systems
Proceedings of the 2nd international conference on Virtual execution environments
HotpathVM: an effective JIT compiler for resource-constrained devices
Proceedings of the 2nd international conference on Virtual execution environments
The DaCapo benchmarks: java benchmarking development and analysis
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Evaluating Indirect Branch Handling Mechanisms in Software Dynamic Translation Systems
Proceedings of the International Symposium on Code Generation and Optimization
The java hotspotTM server compiler
JVM'01 Proceedings of the 2001 Symposium on JavaTM Virtual Machine Research and Technology Symposium - Volume 1
The use of traces for inlining in java programs
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
A self-adjusting code cache manager to balance start-up time and memory usage
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
Trace-based compilation in execution environments without interpreters
Proceedings of the 8th International Conference on the Principles and Practice of Programming in Java
SPUR: a trace-based JIT compiler for CIL
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Improving the performance of trace-based systems by false loop filtering
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Reducing trace selection footprint for large-scale Java applications without performance loss
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
HQEMU: a multi-threaded and retargetable dynamic binary translator on multicores
Proceedings of the Tenth International Symposium on Code Generation and Optimization
Trace construction using enhanced performance monitoring
Proceedings of the ACM International Conference on Computing Frontiers
Hi-index | 0.00 |
Java virtual machines have historically employed either a "wholemethod" or a "trace" methodology for selecting regions of code for optimization. Adaptive whole-method optimization primarily leverages intra-procedural optimizations derived from classic static compilation techniques whereas trace optimization utilizes an interpreter to select, manage, and dispatch inter-procedural fragments of frequently executed code. In this paper we present our hybrid approach for supplementing the comprehensive strengths of a whole-method JIT compiler with the inter-procedural refinement of trace fragment selection and show that that the two techniques would be mutually beneficial. Using the "interpreterless" Jikes RVM as a foundation, we use our trace profiling subsystem to identify an application's working set as a collection of hot traces and show that there is a significant margin for improvement in instruction ordering that can be addressed by trace execution. Our benchmark hot-trace profiles indicate that 20% of transitions between machine-code basic blocks as laid out by the JIT compiler are non-contiguous, many of which are transfers of control flow to locations outside of the current virtual memory page. Additionally, the analyses performed by the adaptive whole-method JIT compiler allow for better identification of trace starting and stopping locations, an improvement over the popular next-executing-tail (NET) trace selection scheme. We show minimal overhead for trace selection indicating that inter-procedural trace execution provides an opportunity to improve both instruction locality as well as compiler-directed branch prediction without significant run-time cost.