Trace fragment selection within method-based JVMs

  • Authors: Duane Merrill; Kim Hazelwood
  • Affiliations: University of Virginia, Charlottesville, VA; University of Virginia, Charlottesville, VA
  • Venue: Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
  • Year: 2008

Abstract

Java virtual machines have historically employed either a "whole-method" or a "trace" methodology for selecting regions of code for optimization. Adaptive whole-method optimization primarily leverages intra-procedural optimizations derived from classic static compilation techniques, whereas trace optimization utilizes an interpreter to select, manage, and dispatch inter-procedural fragments of frequently executed code. In this paper we present our hybrid approach for supplementing the comprehensive strengths of a whole-method JIT compiler with the inter-procedural refinement of trace fragment selection, and show that the two techniques would be mutually beneficial. Using the "interpreterless" Jikes RVM as a foundation, we use our trace profiling subsystem to identify an application's working set as a collection of hot traces and show that there is a significant margin for improvement in instruction ordering that can be addressed by trace execution. Our benchmark hot-trace profiles indicate that 20% of transitions between machine-code basic blocks, as laid out by the JIT compiler, are non-contiguous, and many of these are transfers of control flow to locations outside of the current virtual memory page. Additionally, the analyses performed by the adaptive whole-method JIT compiler allow for better identification of trace starting and stopping locations, an improvement over the popular next-executing-tail (NET) trace selection scheme. We show minimal overhead for trace selection, indicating that inter-procedural trace execution provides an opportunity to improve both instruction locality and compiler-directed branch prediction without significant run-time cost.
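
To make the two ideas named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simplified form of NET in which trace heads are only the targets of backward taken branches (trace exits as heads are omitted), and it models execution as a flat stream of basic-block start addresses. The function names, the `hot_threshold` and `max_len` parameters, and the `block_end` map are all hypothetical, introduced only for illustration.

```python
from collections import defaultdict


def select_traces_net(block_stream, hot_threshold=3, max_len=16):
    """Simplified NET-style selection over a stream of executed block addresses.

    Assumption: a transition to an address <= the previous block's address
    is treated as a backward taken branch, and its target is a trace head.
    """
    counters = defaultdict(int)  # arrivals per potential trace head
    traces = {}                  # head address -> recorded list of blocks
    recording = None             # trace being recorded, or None
    prev = None
    for block in block_stream:
        backward = prev is not None and block <= prev
        if recording is not None:
            # Stop at a backward branch, an existing trace head, or the cap.
            if backward or block in traces or len(recording) >= max_len:
                traces[recording[0]] = recording
                recording = None
            else:
                recording.append(block)
        if recording is None and backward and block not in traces:
            counters[block] += 1
            if counters[block] >= hot_threshold:
                # Head is hot: record whatever executes next (the "tail").
                recording = [block]
        prev = block
    # A trace still being recorded when the stream ends is discarded.
    return traces


def noncontiguous_fraction(trace, block_end):
    """Fraction of transitions in a trace that are not fall-throughs.

    block_end (hypothetical) maps a block's start address to the address
    just past its last instruction in the compiler's code layout.
    """
    transitions = list(zip(trace, trace[1:]))
    if not transitions:
        return 0.0
    jumps = sum(1 for src, dst in transitions if block_end[src] != dst)
    return jumps / len(transitions)


if __name__ == "__main__":
    stream = [10, 20, 30] * 5            # a loop over three blocks
    traces = select_traces_net(stream)
    print(traces)
    print(noncontiguous_fraction(traces[10], {10: 20, 20: 28, 30: 40}))
```

In this sketch, the second function corresponds to the kind of measurement behind the abstract's non-contiguity figure: a transition counts as non-contiguous when the next block in the trace does not begin where the previous block ends in the laid-out code. The paper's point about trace boundaries would enter at the head and stop conditions, which a whole-method JIT's analyses could choose more precisely than the backward-branch heuristic used here.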