Text compression
Improving semi-static branch prediction by code replication
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Improving the accuracy of static branch prediction using branch correlation
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Reducing branch costs via branch alignment
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Optimizing an ANSI C interpreter with superoperators
POPL '95 Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Stack caching for interpreters
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
A comparative analysis of schemes for correlated branch prediction
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
The structure and performance of interpreters
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Improving the Accuracy of History-Based Branch Prediction
IEEE Transactions on Computers
Optimizing direct threaded code by selective inlining
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Accurate indirect branch prediction
Proceedings of the 25th annual international symposium on Computer architecture
A code compression system based on pipelined interpreters
Software—Practice & Experience
Communications of the ACM
Optimising Bytecode Emulation for Prolog
PPDP '99 Proceedings of the International Conference PPDP'99 on Principles and Practice of Declarative Programming
Multi-stage Cascaded Prediction
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Vmgen: a generator of efficient virtual machine interpreters
Software—Practice & Experience
Optimizing indirect branch prediction accuracy in virtual machine interpreters
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Buffering databse operations for enhanced instruction cache performance
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
A portable research framework for the execution of java bytecode
A portable research framework for the execution of java bytecode
Context Threading: A Flexible and Efficient Dispatch Technique for Virtual Machine Interpreters
Proceedings of the international symposium on Code generation and optimization
Adapting branch-target buffer to improve the target predictability of java code
ACM Transactions on Architecture and Code Optimization (TACO)
Virtual Machines: Versatile Platforms for Systems and Processes (The Morgan Kaufmann Series in Computer Architecture and Design)
SableVM: a research framework for the efficient execution of java bytecode
JVM'01 Proceedings of the 2001 Symposium on JavaTM Virtual Machine Research and Technology Symposium - Volume 1
Effective inline-threaded interpretation of Java bytecode using preparation sequences
CC'03 Proceedings of the 12th international conference on Compiler construction
Tiger – an interpreter generation tool
CC'05 Proceedings of the 14th international conference on Compiler Construction
Optimization strategies for a java virtual machine interpreter on the cell broadband engine
Proceedings of the 5th conference on Computing frontiers
Proceedings of the Third Workshop on Virtual Machines and Intermediate Languages
Design of a real-time optimized emulation method
Proceedings of the Conference on Design, Automation and Test in Europe
Proceedings of the 9th International Conference on Principles and Practice of Programming in Java
Proceedings of the 2013 ACM international symposium on New ideas, new paradigms, and reflections on programming & software
Hi-index | 0.00 |
Interpreters designed for efficiency execute a huge number of indirect branches and can spend more than half of the execution time in indirect branch mispredictions. Branch target buffers (BTBs) are the most widely available form of indirect branch prediction; however, their prediction accuracy for existing interpreters is only 2%--50%. In this article we investigate two methods for improving the prediction accuracy of BTBs for interpreters: replicating virtual machine (VM) instructions and combining sequences of VM instructions into superinstructions. We investigate static (interpreter build-time) and dynamic (interpreter runtime) variants of these techniques and compare them and several combinations of these techniques. To show their generality, we have implemented these optimizations in VMs for both Java and Forth. These techniques can eliminate nearly all of the dispatch branch mispredictions, and have other benefits, resulting in speedups by a factor of up to 4.55 over efficient threaded-code interpreters, and speedups by a factor of up to 1.34 over techniques relying on dynamic superinstructions alone.