Shade: a fast instruction-set simulator for execution profiling
SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Embra: fast and flexible machine simulation
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
FX!32: A Profile-Directed Binary Translator
IEEE Micro
An infrastructure for adaptive dynamic optimization
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
RIFLE: An Architectural Framework for User-Centric Information-Flow Security
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Constructing Virtual Architectures on a Tiled Processor
Proceedings of the International Symposium on Code Generation and Optimization
A comparison of software and hardware techniques for x86 virtualization
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
QEMU, a fast and portable dynamic translator
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
Dynamic compilation: the benefits of early investing
Proceedings of the 3rd international conference on Virtual execution environments
VirtualBox: bits and bytes masquerading as machines
Linux Journal
FaCSim: a fast and cycle-accurate architecture simulator for embedded systems
Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
CFS Optimizations to KVM Threads on Multi-Core Environment
ICPADS '09 Proceedings of the 2009 15th International Conference on Parallel and Distributed Systems
Hi-index | 0.00 |
Time-to-market is a critical factor in the commercial success of new consumer devices. To minimise delays, system developers and third party software vendors must be able to test their applications before the hardware platform becomes available. Instruction Set Simulators (ISS's) underpin this early development by emulating new platforms on ordinary desktop machines. As target platforms become faster the performance demands on ISS's become greater. A key challenge is to leverage available simulator technology to produce, at low cost, incremental performance gains needed to keep up with these demands. In this work we use a very simple strategy: in-place-block-replacement to produce improvements in the performance of the popular QEMU functional simulator. The replacement blocks are generated at runtime using the LLVM JIT running on spare processor cores. This strategy provides a very lightweight way to incrementally build an alternate code generator within an existing ISS framework without incurring a substantial runtime cost. We show the approach is effective in reducing the runtimes of the QEMU user-space emulator on a number of SPECint 2006 benchmarks.