This paper describes the FAST methodology that enables a single FPGA to accelerate the performance of cycle-accurate computer system simulators modeling modern, realistic SoCs, embedded systems and standard desktop/laptop/server computer systems. The methodology partitions a simulator into (i) a functional model that simulates the functionality of the computer system and (ii) a predictive model that predicts performance and other metrics. The partitioning is crafted to map most of the parallel work onto a hardware-based predictive model, eliminating much of the complexity and difficulty of simulating parallel constructs on a sequential platform. FAST conventions and libraries have been designed to make creating, modifying, using and measuring such simulators straightforward. We describe a prototype FAST system: a full-system, RTL-level cycle-accurate-capable computer system simulator that executes the x86 ISA, boots unmodified Linux and executes unmodified x86 applications. The prototype runs two to three orders of magnitude faster than the fastest Intel and AMD RTL-level cycle-accurate x86 software-based simulators and about six to seven times faster than RTL simulation.
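The functional/predictive partitioning described above can be illustrated with a toy software sketch. This is not the FAST implementation: the two-instruction ISA, the `TraceEntry` record, the `LATENCY` table and the in-order, single-issue timing rule are all assumptions made for illustration, and both halves run in software here, whereas FAST maps the predictive model onto an FPGA. The point is only the division of labor: the functional model computes values and emits a trace, and the predictive model consumes that trace and computes cycle counts without ever computing values.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class TraceEntry:
    """One executed instruction, as seen by the predictive model (hypothetical format)."""
    op: str
    dest: str
    srcs: Tuple[str, ...]


# Hypothetical toy program format: (opcode, dest, src_a, src_b)
Instruction = Tuple[str, str, str, str]


def functional_model(program: List[Instruction],
                     regs: Dict[str, int]) -> Iterator[TraceEntry]:
    """Executes each instruction for its effect on `regs` and yields a trace.

    This half knows nothing about cycles or latencies.
    """
    for op, dest, a, b in program:
        if op == "add":
            regs[dest] = regs[a] + regs[b]
        elif op == "mul":
            regs[dest] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode: {op}")
        yield TraceEntry(op, dest, (a, b))


# Assumed execution latencies in cycles (illustrative values only).
LATENCY = {"add": 1, "mul": 3}


def predictive_model(trace: Iterable[TraceEntry]) -> int:
    """Predicts total cycles for an in-order, single-issue pipeline.

    An instruction issues once all of its source registers are ready,
    and at most one instruction issues per cycle. No values are computed.
    """
    ready: Dict[str, int] = {}  # register -> cycle at which its value is available
    next_issue = 0              # earliest cycle at which the next instruction may issue
    finish = 0                  # cycle at which the last result becomes available
    for entry in trace:
        issue = max([next_issue] + [ready.get(s, 0) for s in entry.srcs])
        done = issue + LATENCY[entry.op]
        ready[entry.dest] = done
        next_issue = issue + 1
        finish = max(finish, done)
    return finish


if __name__ == "__main__":
    regs = {"r0": 2, "r1": 3}
    program = [
        ("mul", "r2", "r0", "r1"),  # r2 = r0 * r1
        ("add", "r3", "r2", "r0"),  # depends on r2, so it waits for the multiply
        ("add", "r4", "r0", "r1"),  # independent, but issues in order
    ]
    cycles = predictive_model(functional_model(program, regs))
    print(regs, cycles)
```

Because the predictive model sees only the trace, its timing rules can be changed (for example, a different latency table) without touching the functional model, which is the kind of separation the partitioning relies on.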