The implementation of new programming languages benefits from interpretation because it is simple, flexible and portable. The only downside is speed of execution, as there remains a large performance gap between even efficient interpreters and systems that include a just-in-time (JIT) compiler. Augmenting an interpreter with a JIT, however, is not a small task. Today, Java JITs are typically method-based. To compile whole methods, the JIT must re-implement much functionality already provided by the interpreter, leading to a “big bang” development effort before the JIT can be deployed. Adding a JIT to an interpreter would be easier if we could shift more gradually from dispatching virtual instruction bodies implemented for the interpreter to running instructions compiled into native code by the JIT. We show that virtual instructions implemented as lightweight callable routines can form the basis for a very efficient interpreter. Our new technique, interpreted traces, identifies hot paths, or traces, as a virtual program is interpreted. By exploiting the way traces predict branch destinations, our technique markedly reduces branch mispredictions caused by dispatch. Interpreted traces are a high-performance technique, running about 25% faster than direct threading. We show that interpreted traces are a good starting point for a trace-based JIT. We extend our interpreter so that traces may contain a mixture of compiled code for some virtual instructions and calls to virtual instruction bodies for others. By compiling about 50 integer and object virtual instructions to machine code, we improve performance by about 30% over interpreted traces, running about twice as fast as the direct-threaded system with which we started.
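The structure described above (virtual instruction bodies as callable routines, plus traces recorded along hot paths) can be illustrated with a toy stack machine. The following Python sketch is not the authors' implementation. The instruction set, the `HOT` threshold, the choice to start recording at the target of a frequently taken backward branch, and all names (`VM`, `run`, `make_sum_program`) are assumptions made for illustration. It shows only the control structure; it cannot reproduce the branch-prediction effects or the speedups reported in the abstract.

```python
from dataclasses import dataclass, field

@dataclass
class VM:
    code: list                      # list of (body, operand) pairs
    stack: list = field(default_factory=list)
    variables: dict = field(default_factory=dict)
    pc: int = 0
    halted: bool = False

# Each virtual instruction body is an ordinary callable routine.
# It updates the VM state and sets vm.pc to the next virtual instruction.
def op_push(vm, value):
    vm.stack.append(value); vm.pc += 1

def op_load(vm, name):
    vm.stack.append(vm.variables[name]); vm.pc += 1

def op_store(vm, name):
    vm.variables[name] = vm.stack.pop(); vm.pc += 1

def op_add(vm, _):
    b = vm.stack.pop(); a = vm.stack.pop()
    vm.stack.append(a + b); vm.pc += 1

def op_lt(vm, _):
    b = vm.stack.pop(); a = vm.stack.pop()
    vm.stack.append(a < b); vm.pc += 1

def op_jump_if(vm, target):
    vm.pc = target if vm.stack.pop() else vm.pc + 1

def op_halt(vm, _):
    vm.halted = True

HOT = 3          # hypothetical threshold: backward branches before recording
MAX_TRACE = 64   # hypothetical cap on recorded trace length

def run(vm):
    counts, traces = {}, {}
    recording = None                       # (head_pc, entries) while recording
    stats = {"dispatched": 0, "in_trace": 0}
    while not vm.halted:
        trace = traces.get(vm.pc) if recording is None else None
        if trace is not None:
            # Interpreted trace: call the recorded bodies in order. Each entry
            # carries the pc the trace expects next; a mismatch is a trace exit.
            for body, operand, expected_next in trace:
                body(vm, operand)
                stats["in_trace"] += 1
                if vm.halted or vm.pc != expected_next:
                    break
            continue
        pc = vm.pc
        body, operand = vm.code[pc]        # ordinary dispatch
        body(vm, operand)
        stats["dispatched"] += 1
        if recording is not None:
            head, entries = recording
            entries.append((body, operand, vm.pc))
            if vm.pc == head:              # path closed back on its head
                traces[head] = entries
                recording = None
            elif vm.halted or len(entries) >= MAX_TRACE:
                recording = None           # abandon this recording
        elif vm.pc < pc:                   # a backward branch was taken
            counts[vm.pc] = counts.get(vm.pc, 0) + 1
            if counts[vm.pc] >= HOT:
                recording = (vm.pc, [])
    return stats

def make_sum_program(n):
    """total = 0 + 1 + ... + (n - 1), written as a loop (requires n >= 1)."""
    return [
        (op_push, 0), (op_store, "i"),
        (op_push, 0), (op_store, "total"),
        # loop head is pc 4
        (op_load, "total"), (op_load, "i"), (op_add, None), (op_store, "total"),
        (op_load, "i"), (op_push, 1), (op_add, None), (op_store, "i"),
        (op_load, "i"), (op_push, n), (op_lt, None), (op_jump_if, 4),
        (op_halt, None),
    ]

if __name__ == "__main__":
    vm = VM(code=make_sum_program(100))
    stats = run(vm)
    print(vm.variables["total"], stats)
```

In this sketch a trace is just a list of calls to the same bodies the interpreter already uses, which is what makes a gradual transition plausible: a JIT could replace individual entries with compiled code while leaving the remaining entries as calls to instruction bodies.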