Bytecode compression via profiled grammar rewriting

  • Authors: William S. Evans; Christopher W. Fraser
  • Affiliations: Computer Science Dept., University of British Columbia, Vancouver, BC; Microsoft Research, One Microsoft Way, Redmond, WA
  • Venue: Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
  • Year: 2001

Abstract

This paper describes the design and implementation of a method for producing compact, bytecoded instruction sets and interpreters for them. The system accepts a grammar for programs written using a simple bytecoded stack-based instruction set, as well as a training set of sample programs. It transforms the grammar, creating an expanded grammar that represents the same language as the original grammar but permits a shorter derivation of the sample programs and others like them. A program's derivation under the expanded grammar forms the compressed bytecode representation of the program. The interpreter for this bytecode is automatically generated from the original bytecode interpreter and the expanded grammar. Programs expressed using compressed bytecode can be substantially smaller than their original bytecode representation and even their machine code representation. For example, compression cuts the bytecode for lcc from 199KB to 58KB but increases the size of the interpreter by just over 11KB.