Are Bytecodes an Atavism?

Authors:
Theo D'Hondt
Affiliations:
Programming Technology Lab, Vrije Universiteit Brussel, Brussels, Belgium B1050
Venue:
Self-Sustaining Systems
Year:
2008

Abstract

The notion of bytecodes can be traced back to the 1960s with BCPL O-codes, which were used essentially to pursue platform independence. Later, with Pascal p-codes and Smalltalk bytecodes, the objective shifted to virtual machines as precursors to dedicated hardware implementations, culminating in Lilith and SOAR. More recently, Java adopted a similar approach, but with the advent of efficient JIT technology, bytecodes resumed their role as an intermediate representation of programs written in a higher-level language. It is our conjecture that using bytecodes in this capacity is an atavism, a throwback to times when hardware bytecode machines were the ultimate target. We suggest that the question of an optimal intermediate representation must be raised. In this paper we investigate the exact opposite of the bytecode approach: we define an intermediate notation that is as close as possible to the semantics of the programming language under consideration. It is then a matter of applying the appropriate compiler technology to obtain a JIT strategy that generates efficient machine code. A more interesting question addressed here is whether a virtual machine built using this strategy can match a bytecode interpreter in perceived performance while giving the running program much more control over its execution than the bytecode approach allows. We investigate an uncompromising approach in which a unified memory architecture hosts all structures relevant during program execution, including program data structures, the program representation, interpreter caches and runtime stacks. We give an existence proof that it is possible to build a virtual machine along these lines that matches a bytecode implementation in performance while giving much more "self" control to the running program.
Two cases are presented here: the Pico language and virtual machine, which were co-designed with the unified memory approach in mind, and a Scheme virtual machine intended to match the performance of PLT-Scheme.