Mimic: a fast system/370 simulator
SIGPLAN '87 Papers of the Symposium on Interpreters and interpretive techniques
DAISY: dynamic compilation for 100% architectural compatibility
Proceedings of the 24th annual international symposium on Computer architecture
Binary translation and architecture convergence issues for IBM system/390
Proceedings of the 14th international conference on Supercomputing
Dynamo: a transparent dynamic optimization system
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Execution-Based Scheduling for VLIW Architectures
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Retargetable and reconfigurable software dynamic translation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
An efficient and generic reversible debugger using the virtual machine based approach
Proceedings of the 1st ACM/USENIX international conference on Virtual execution environments
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Improving Region Selection in Dynamic Optimization Systems
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
HDTrans: an open source, low-level dynamic instrumentation system
Proceedings of the 2nd international conference on Virtual execution environments
Valgrind: a framework for heavyweight dynamic binary instrumentation
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Evaluating Indirect Branch Handling Mechanisms in Software Dynamic Translation Systems
Proceedings of the International Symposium on Code Generation and Optimization
PinOS: a programmable framework for whole-system dynamic instrumentation
Proceedings of the 3rd international conference on Virtual execution environments
Hi-index | 0.00 |
A dynamic binary translator (DBT) is a runtime system that translates binary code on the fly, for example to emulate the execution of the binary code on a processor with a different instruction set. One of the major sources of the overhead is the resolution of the branch target addresses for indirect branch instructions. Previous work has addressed this problem for a single virtual address space, but none has addressed it for multiple virtual address spaces in the context of the system-level DBT. This is challenging for compiler optimizations because the compiler cannot compute the virtual addresses of the branch targets for indirect branches at compile-time since they are affected by the runtime states of the emulated TLB. In this paper, we propose a new compiler optimization technique to address the problem for a system-level DBT. Our key idea is to use an offset from the virtual address of each page that contains a branch instruction, since this offset is not affected by the emulated TLB. We found that the compiler can often compute the offset using compile-time constants and that this approach significantly simplifies the guard code necessary for an indirect branch. We implemented this technique in a compiler of a system-level DBT for the z/Architecture. Our experimental results showed our technique can reduce the execution times of the CBW2 benchmarks, part of the standard LSPR benchmark, by up to 5.9% and 2.5% on average. Our analysis indicated that our technique was able to optimize 3.8% of the total dynamic instructions in the original binary code, while completely removing the guard code for 98.9% of these indirect branches.