Today, any newly added hardware feature must leave the underlying Instruction Set Architecture (ISA) unchanged, to avoid adapting or recompiling existing code. Binary translation (BT) allows already compiled applications to execute on different architectures. It therefore opens new possibilities for designers, who were previously tied to a specific ISA and all its legacy hardware issues. To overcome the performance penalty inherent in BT, we propose a new mechanism based on a dynamic two-level binary translation system. The first level performs the actual translation to an intermediate machine language, while the second level optimizes the already translated instructions for execution on the target architecture. The system is highly flexible: it supports porting radically different ISAs and employing different target architectures. This paper presents the first effort in this direction: it translates code implemented in the x86 ISA to MIPS assembly (the intermediate language), which is then optimized by the target architecture, a dynamically reconfigurable array. We show that it is possible to maintain binary compatibility, with performance improvements and no energy losses, when compared to native execution.
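The two-level flow described above can be illustrated with a toy sketch in Python. It is not the paper's mechanism: the register mapping, the three supported opcodes, and the greedy dependence-based grouping are all assumptions made for illustration, and the grouping step merely stands in for the optimization that the reconfigurable array would perform on the translated MIPS code.

```python
# Hypothetical two-level translation sketch; names and rules are illustrative only.
from typing import List, Tuple

Instr = Tuple[str, ...]

# Assumed mapping from x86 register names to MIPS temporaries.
REG_MAP = {"eax": "$t0", "ebx": "$t1", "ecx": "$t2", "edx": "$t3"}


def level1_translate(x86_code: List[Instr]) -> List[Instr]:
    """Level 1: turn two-operand x86-like ops into three-operand MIPS-like ops."""
    out: List[Instr] = []
    for op, dst, src in x86_code:
        d, s = REG_MAP[dst], REG_MAP[src]
        if op == "mov":
            out.append(("addu", d, s, "$zero"))  # register move
        elif op == "add":
            out.append(("addu", d, d, s))
        elif op == "sub":
            out.append(("subu", d, d, s))
        else:
            raise ValueError(f"unsupported opcode in this sketch: {op}")
    return out


def level2_group(mips_code: List[Instr]) -> List[List[Instr]]:
    """Level 2: greedily pack consecutive instructions that have no register
    dependences into groups that could, in principle, execute in parallel."""
    groups: List[List[Instr]] = []
    current: List[Instr] = []
    written, read = set(), set()
    for ins in mips_code:
        _, dst, *srcs = ins
        src_set = set(srcs) - {"$zero"}
        # Start a new group on a RAW, WAW, or WAR hazard with the current group.
        if (src_set & written) or dst in written or dst in read:
            groups.append(current)
            current, written, read = [], set(), set()
        current.append(ins)
        written.add(dst)
        read |= src_set
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    x86 = [("mov", "eax", "ebx"), ("mov", "ecx", "edx"), ("add", "eax", "ecx")]
    for group in level2_group(level1_translate(x86)):
        print(group)
```

In this example the two moves touch disjoint registers and land in one group, while the add depends on both and starts a second group. Keeping the two levels as separate functions mirrors the separation in the abstract: a different source ISA would only replace the first function, and a different target would only replace the second.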