A study of devirtualization techniques for a Java Just-In-Time compiler
OOPSLA '00 Proceedings of the 15th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
An infrastructure for adaptive dynamic optimization
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Hardware Support for Control Transfers in Code Caches
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Evaluating fragment construction policies for SDT systems
Proceedings of the 2nd international conference on Virtual execution environments
Virtual Machines: Versatile Platforms for Systems and Processes (The Morgan Kaufmann Series in Computer Architecture and Design)
HDTrans: an open source, low-level dynamic instrumentation system
Proceedings of the 2nd international conference on Virtual execution environments
QEMU, a fast and portable dynamic translator
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
VPC prediction: reducing the cost of indirect branches via hardware-based dynamic devirtualization
Proceedings of the 34th annual international symposium on Computer architecture
Valgrind: a framework for heavyweight dynamic binary instrumentation
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Evaluating Indirect Branch Handling Mechanisms in Software Dynamic Translation Systems
Proceedings of the International Symposium on Code Generation and Optimization
Persistent Code Caching: Exploiting Code Reuse Across Executions and Applications
Proceedings of the International Symposium on Code Generation and Optimization
Process-shared and persistent code caches
Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Characterization of DBT overhead
IISWC '09 Proceedings of the 2009 IEEE International Symposium on Workload Characterization (IISWC)
DBT path selection for holistic memory efficiency and performance
Proceedings of the 6th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Dynamic binary translation specialized for embedded systems
Proceedings of the 6th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Binary translation using peephole superoptimizers
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
CoDBT: A multi-source dynamic binary translator using hardware-software collaborative techniques
Journal of Systems Architecture: the EUROMICRO Journal
Dynamic cache contention detection in multi-threaded applications
Proceedings of the 7th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Evaluating indirect branch handling mechanisms in software dynamic translation systems
ACM Transactions on Architecture and Code Optimization (TACO)
Transparent dynamic instrumentation
VEE '12 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
DDGacc: boosting dynamic DDG-based binary optimizations through specialized hardware support
VEE '12 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
HQEMU: a multi-threaded and retargetable dynamic binary translator on multicores
Proceedings of the Tenth International Symposium on Code Generation and Optimization
PinADX: an interface for customizable debugging with dynamic instrumentation
Proceedings of the Tenth International Symposium on Code Generation and Optimization
Hi-index | 0.00 |
Dynamic binary translation system must perform an address translation for every execution of indirect branch instructions. The procedure to convert Source binary Program Counter (SPC) address to Translated Program Counter (TPC) address always takes more than 10 instructions, becoming a major source of performance overhead. This paper proposes a novel mechanism called SPc-Indexed REdirecting (SPIRE), which can significantly reduce the indirect branch handling overhead. SPIRE doesn't rely on hash lookup and address mapping table to perform address translation. It reuses the source binary code space to build a SPC-indexed redirecting table. This table can be indexed directly by SPC address without hashing. With SPIRE, the indirect branch can jump to the originally SPC address without address translation. The trampoline residing in the SPC address will redirect the control flow to related code cache. Only 2-6 instructions are needed to handle an indirect branch execution. As part of the source binary would be overwritten, a shadow page mechanism is explored to keep transparency of the corrupt source binary code page. Online profiling is adopted to reduce the memory overhead. We have implemented SPIRE on an x86 to x86 DBT system, and discussed the implementation issues on different guest and host architectures. The experiments show that, compared with hash lookup mechanism, SPIRE can reduce the performance overhead by 36.2% on average, up to 51.4%, while only 5.6% extra memory is needed. SPIRE can cooperate with other indirect branch handling mechanisms easily, and we believe the idea of SPIRE can also be applied on other occasions that need address translation.