Decompilation of Java bytecode to Prolog by partial evaluation

  • Authors:
  • Miguel Gómez-Zamalloa;Elvira Albert;Germán Puebla

  • Affiliations:
  • DSIC, Complutense University of Madrid, E-28040 Madrid, Spain;DSIC, Complutense University of Madrid, E-28040 Madrid, Spain;CLIP, Technical University of Madrid, E-28660 Boadilla del Monte, Madrid, Spain

  • Venue:
  • Information and Software Technology
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Reasoning about Java bytecode (JBC) is complicated due to its unstructured control-flow, the use of three-address code combined with the use of an operand stack, etc. Therefore, many static analyzers and model checkers for JBC first convert the code into a higher-level representation. In contrast to traditional decompilation, such representation is often not Java source, but rather some intermediate language which is a good input for the subsequent phases of the tool. Interpretive decompilation consists in partially evaluating an interpreter for the compiled language (in this case JBC) written in a high-level language with respect to the code to be decompiled. There have been proofs-of-concept that interpretive decompilation is feasible, but there remain important open issues when it comes to decompile a real language such as JBC. This paper presents, to the best of our knowledge, the first modular scheme to enable interpretive decompilation of a realistic programming language to a high-level representation, namely of JBC to Prolog. We introduce two notions of optimality which together require that decompilation does not generate code more than once for each program point. We demonstrate the impact of our modular approach and optimality issues on a series of realistic benchmarks. Decompilation times and decompiled program sizes are linear with the size of the input bytecode program. This demonstrates empirically the scalability of modular decompilation of JBC by partial evaluation.