Text compression
Improving semi-static branch prediction by code replication
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Improving the accuracy of static branch prediction using branch correlation
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Reducing branch costs via branch alignment
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Optimizing an ANSI C interpreter with superoperators
POPL '95 Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Stack caching for interpreters
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
A comparative analysis of schemes for correlated branch prediction
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
The structure and performance of interpreters
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Optimizing direct threaded code by selective inlining
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Accurate indirect branch prediction
Proceedings of the 25th annual international symposium on Computer architecture
A code compression system based on pipelined interpreters
Software—Practice & Experience
Communications of the ACM
Optimising Bytecode Emulation for Prolog
PPDP '99 Proceedings of the International Conference PPDP'99 on Principles and Practice of Declarative Programming
Multi-stage Cascaded Prediction
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Vmgen: a generator of efficient virtual machine interpreters
Software—Practice & Experience
Retargeting JIT Compilers by using C-Compiler Generated Executable Code
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Context Threading: A Flexible and Efficient Dispatch Technique for Virtual Machine Interpreters
Proceedings of the international symposium on Code generation and optimization
Combining stack caching with dynamic superinstructions
Proceedings of the 2004 workshop on Interpreters, virtual machines and emulators
Code sharing among states for stack-caching interpreter
Proceedings of the 2004 workshop on Interpreters, virtual machines and emulators
Interpreting programs in static single assignment form
Proceedings of the 2004 workshop on Interpreters, virtual machines and emulators
Catenation and specialization for Tcl virtual machine performance
Proceedings of the 2004 workshop on Interpreters, virtual machines and emulators
Adapting branch-target buffer to improve the target predictability of java code
ACM Transactions on Architecture and Code Optimization (TACO)
Mixed mode execution with context threading
CASCON '05 Proceedings of the 2005 conference of the Centre for Advanced Studies on Collaborative research
A study of the influence of coverage on the relationship between static and dynamic coupling metrics
Science of Computer Programming - Special issue: Principles and practices of programming in Java (PPPJ 2004)
The case for virtual register machines
Science of Computer Programming - Special issue on advances in interpreters, virtual machines and emulators (IVME'03)
YETI: a graduallY extensible trace interpreter
Proceedings of the 3rd international conference on Virtual execution environments
Optimizing indirect branch prediction accuracy in virtual machine interpreters
ACM Transactions on Programming Languages and Systems (TOPLAS)
Implementing fast JVM interpreters using Java itself
Proceedings of the 5th international symposium on Principles and practice of programming in Java
Improving the performance of object-oriented languages with dynamic predication of indirect jumps
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Analyzing the performance of code-copying virtual machines
Proceedings of the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications
Minimizing dependencies within generic classes for faster and smaller programs
Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
Virtual-Machine Abstraction and Optimization Techniques
Electronic Notes in Theoretical Computer Science (ENTCS)
Efficient interpretation using quickening
Proceedings of the 6th symposium on Dynamic languages
Interpreter instruction scheduling
CC'11/ETAPS'11 Proceedings of the 20th international conference on Compiler construction: part of the joint European conferences on theory and practice of software
Frequency estimation of virtual call targets for object-oriented programs
Proceedings of the 25th European conference on Object-oriented programming
Proceedings of the 9th International Conference on Principles and Practice of Programming in Java
Branch strategies to optimize decision trees for wide-issue architectures
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Swift: a register-based JIT compiler for embedded JVMs
VEE '12 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
CVP: an energy-efficient indirect branch prediction with compiler-guided value pattern
Proceedings of the 26th ACM international conference on Supercomputing
Vectorization technology to improve interpreter performance
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Efficient interpreter optimizations for the JVM
Proceedings of the 2013 International Conference on Principles and Practices of Programming on the Java Platform: Virtual Machines, Languages, and Tools
Efficient hosted interpreters on the JVM
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
Interpreters designed for efficiency execute a huge number of indirect branches and can spend more than half of the execution time in indirect branch mispredictions. Branch target buffers are the best widely available form of indirect branch prediction; however, their prediction accuracy for existing interpreters is only 2%--50%. In this paper we investigate two methods for improving the prediction accuracy of BTBs for interpreters: replicating virtual machine (VM) instructions and combining sequences of VM instructions into superinstructions. We investigate static (interpreter build-time) and dynamic (interpreter run-time) variants of these techniques and compare them and several combinations of these techniques. These techniques can eliminate nearly all of the dispatch branch mispredictions, and have other benefits, resulting in speedups by a factor of up to 3.17 over efficient threaded-code interpreters, and speedups by a factor of up to 1.3 over techniques relying on superinstructions alone.