Trace fragment selection within method-based JVMs

Authors:
Duane Merrill;Kim Hazelwood
Affiliations:
University of Virginia, Charlottesville, VA;University of Virginia, Charlottesville, VA
Venue:
Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Year:
2008

Citing 22
Cited 7

Profile guided code positioning

PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Dynamo: a transparent dynamic optimization system

PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Adaptive optimization in the Jalapeño JVM

OOPSLA '00 Proceedings of the 15th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Software profiling for hot path prediction: less is more

ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
An Empirical Study of Method In-lining for a Java Just-in-Time Compiler

Proceedings of the 2nd Java Virtual Machine Research and Technology Symposium
The Transmeta Code Morphing™ Software: using speculation, recovery, and adaptive retranslation to address real-life challenges

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Retargetable and reconfigurable software dynamic translation

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Adaptive online context-sensitive inlining

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
An infrastructure for adaptive dynamic optimization

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Dynamic profiling and trace cache generation

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Overview of the IBM Java just-in-time compiler

IBM Systems Journal
Pin: building customized program analysis tools with dynamic instrumentation

Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
The Jikes research virtual machine project: building an open-source research community

IBM Systems Journal
Runtime specialization with optimistic heap analysis

OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Improving Region Selection in Dynamic Optimization Systems

Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Fast and efficient partial code reordering: taking advantage of dynamic recompilation

Proceedings of the 5th international symposium on Memory management
Evaluating fragment construction policies for SDT systems

Proceedings of the 2nd international conference on Virtual execution environments
HotpathVM: an effective JIT compiler for resource-constrained devices

Proceedings of the 2nd international conference on Virtual execution environments
The DaCapo benchmarks: Java benchmarking development and analysis

Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Evaluating Indirect Branch Handling Mechanisms in Software Dynamic Translation Systems

Proceedings of the International Symposium on Code Generation and Optimization
The Java HotSpot™ server compiler

JVM'01 Proceedings of the 2001 Symposium on Java™ Virtual Machine Research and Technology Symposium - Volume 1
The use of traces for inlining in Java programs

LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing

A self-adjusting code cache manager to balance start-up time and memory usage

Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
Trace-based compilation in execution environments without interpreters

Proceedings of the 8th International Conference on the Principles and Practice of Programming in Java
SPUR: a trace-based JIT compiler for CIL

Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Improving the performance of trace-based systems by false loop filtering

Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Reducing trace selection footprint for large-scale Java applications without performance loss

Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
HQEMU: a multi-threaded and retargetable dynamic binary translator on multicores

Proceedings of the Tenth International Symposium on Code Generation and Optimization
Trace construction using enhanced performance monitoring

Proceedings of the ACM International Conference on Computing Frontiers

Abstract

Java virtual machines have historically employed either a "whole-method" or a "trace" methodology for selecting regions of code for optimization. Adaptive whole-method optimization primarily leverages intra-procedural optimizations derived from classic static compilation techniques, whereas trace optimization utilizes an interpreter to select, manage, and dispatch inter-procedural fragments of frequently executed code. In this paper we present our hybrid approach for supplementing the comprehensive strengths of a whole-method JIT compiler with the inter-procedural refinement of trace fragment selection and show that the two techniques would be mutually beneficial. Using the "interpreterless" Jikes RVM as a foundation, we use our trace profiling subsystem to identify an application's working set as a collection of hot traces and show that there is a significant margin for improvement in instruction ordering that can be addressed by trace execution. Our benchmark hot-trace profiles indicate that 20% of transitions between machine-code basic blocks as laid out by the JIT compiler are non-contiguous, many of which are transfers of control flow to locations outside of the current virtual memory page. Additionally, the analyses performed by the adaptive whole-method JIT compiler allow for better identification of trace starting and stopping locations, an improvement over the popular next-executing-tail (NET) trace selection scheme. We show minimal overhead for trace selection, indicating that inter-procedural trace execution provides an opportunity to improve both instruction locality and compiler-directed branch prediction without significant run-time cost.
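
The non-contiguity measurement described in the abstract can be illustrated with a minimal sketch. It is not the authors' profiling subsystem: the `Block` model, the `classify_transitions` helper, and the 4096-byte `PAGE_SIZE` are all assumptions made for illustration. Given a hot trace as a sequence of machine-code basic blocks, it counts transitions whose target does not immediately follow the source block in memory, and how many of those land on a different page.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes; the abstract does not state one


@dataclass(frozen=True)
class Block:
    """Hypothetical model of a machine-code basic block laid out by the JIT."""
    start: int  # address of the first byte
    size: int   # length in bytes

    @property
    def end(self) -> int:
        # Address one past the last byte of the block.
        return self.start + self.size


def classify_transitions(trace):
    """Count total, non-contiguous, and cross-page transitions in a trace.

    A transition a -> b is contiguous when b starts exactly where a ends.
    A non-contiguous transition is cross-page when the last byte of a and
    the first byte of b fall on different pages.
    """
    counts = {"total": 0, "non_contiguous": 0, "cross_page": 0}
    for a, b in zip(trace, trace[1:]):
        counts["total"] += 1
        if b.start == a.end:
            continue  # fall-through layout, nothing to count
        counts["non_contiguous"] += 1
        if (a.end - 1) // PAGE_SIZE != b.start // PAGE_SIZE:
            counts["cross_page"] += 1
    return counts


if __name__ == "__main__":
    a = Block(0x1000, 16)
    b = Block(0x1010, 32)  # directly follows a
    c = Block(0x1100, 8)   # same page as b, but not adjacent
    d = Block(0x3000, 24)  # on a different page
    print(classify_transitions([a, b, c, d, a]))
```

Dividing `non_contiguous` by `total` over a set of hot traces gives the kind of ratio the abstract reports; a trace-based layout would aim to place the blocks of each hot trace back to back so that this ratio falls.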