Local variable access behavior of a hardware-translation based Java virtual machine

  • Authors:
  • Hitoshi Oi

  • Affiliations:
  • Department of Computer Science, The University of Aizu, Aizu-Wakamatsu, Japan

  • Venue:
  • Journal of Systems and Software
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Hardware bytecode translation is a technique to improve the performance of the Java virtual machine (JVM), especially on the portable devices for which the overhead of dynamic compilation is significant. However, since the translation is done on a single bytecode basis, a naive implementation of the JVM generates frequent memory accesses for local variables which can be not only a performance bottleneck but also an obstacle for instruction folding. A solution to this problem is to add a small register file to the data path of the microprocessor which is dedicated for storing local variables. However, the effectiveness of such a local variable register file depends on the size and the local variable access behavior of the applications. In this paper, we analyze the local variable access behavior of various Java applications. In particular, we will investigate the fraction of local variable accesses that are covered by the register file of a varying size, which determines the chip area overhead and the operation speed. We also evaluate the effectiveness of the sliding register window for parameter passing in context of JVM and on-the-fly optimization of local variable to register file mapping. With two types of exceptions, a 16-entry register file achieves coverages of up to 98%. The first type of exception is represented by the SAXON XSLT processor for which the effect of cold miss is significant. Adding the sliding window feature to the register file for parameter passing turns 6.2-13.3% of total accesses from miss to hit to the register file for the SAXON with XSLTMark. The second type of exception is represented by the FFT, which accesses more than 16 local variables for most of method invocations. In this case, on-the-fly profiling is effective. The hit ratio of a 16-entry register file for the FFT is increased from 44% to 83% by an array of 8-bit counters.