Optimizing direct threaded code by selective inlining
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Static and Dynamic Program Compilation by Interpreter Specialization
Higher-Order and Symbolic Computation
Linux Journal
Vmgen: a generator of efficient virtual machine interpreters
Software—Practice & Experience
Optimizing indirect branch prediction accuracy in virtual machine interpreters
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
SableCC, an Object-Oriented Compiler Framework
TOOLS '98 Proceedings of the Technology of Object-Oriented Languages and Systems
Engineering a customizable intermediate representation
Proceedings of the 2003 workshop on Interpreters, virtual machines and emulators
Multicodes: optimizing virtual machines using bytecode sequences
OOPSLA '03 Companion of the 18th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Java-through-C Compilation: An Enabling Technology for Java in Embedded Systems
Proceedings of the conference on Design, automation and test in Europe - Volume 3
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
A retrospective on: "an evaluation of staged run-time optimizations in DyC"
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Overview of the IBM Java just-in-time compiler
IBM Systems Journal
A portable research framework for the execution of java bytecode
A portable research framework for the execution of java bytecode
Retargeting JIT Compilers by using C-Compiler Generated Executable Code
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
A tour of tempo: a program specializer for the C language
Science of Computer Programming - Special issue on program transformation
Context Threading: A Flexible and Efficient Dispatch Technique for Virtual Machine Interpreters
Proceedings of the international symposium on Code generation and optimization
Code sharing among states for stack-caching interpreter
Proceedings of the 2004 workshop on Interpreters, virtual machines and emulators
YARV: yet another RubyVM: innovating the ruby interpreter
OOPSLA '05 Companion to the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Mixed mode execution with context threading
CASCON '05 Proceedings of the 2005 conference of the Centre for Advanced Studies on Collaborative research
The DaCapo benchmarks: java benchmarking development and analysis
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Compiler-guaranteed safety in code-copying virtual machines
CC'08/ETAPS'08 Proceedings of the Joint European Conferences on Theory and Practice of Software 17th international conference on Compiler construction
A graph theoretic approach to cache-conscious placement of data for direct mapped caches
Proceedings of the 2010 international symposium on Memory management
Hi-index | 0.00 |
Many popular programming languages use interpreter-based execution for portability, supporting dynamic or reflective properties, and ease of implementation. Code-copying is an optimization technique for interpreters that reduces the performance gap between interpretation and JIT compilation, offering significant speedups over direct-threading interpretation. Due to varying language features and virtual machine design, however, not all languages benefit from codecopying to the same extent. We consider here properties of interpreted languages, and in particular bytecode and virtual machine construction that enhance or reduce the impact of code-copying. We implemented code-copying and compared performance with the original direct-threading virtual machines for three languages, Java (SableVM), OCaml, and Ruby (Yarv), examining performance on three different architectures, ia32 (Pentium 4), x86_64 (AMD64) and PowerPC (G5). Best speedups are achieved on ia32 by OCaml (maximum 4.88 times, 2.81 times on average), where a small and simple bytecode design facilitates improvements to branch prediction brought by code-copying. Yarv only slightly improves over direct-threading; large working sizes of bytecodes, and a relatively small fraction of time spent in the actual interpreter loop both limit the application of codecopying and its overall net effect. We are able to show that simple ahead of time analysis of VM and execution properties can help determine the suitability of code-copying for a particular VM before an implementation of code-copying is even attempted.