The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Full-system timing-first simulation
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
QEMU, a fast and portable dynamic translator
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
FPGA-Accelerated Simulation Technologies (FAST): Fast, Full-System, Cycle-Accurate Simulators
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
COTSon: infrastructure for full system simulation
ACM SIGOPS Operating Systems Review
MPTLsim: a simulator for X86 multicore processors
Proceedings of the 46th Annual Design Automation Conference
Accurate Functional-First Multicore Simulators
IEEE Computer Architecture Letters
A case for FAME: FPGA architecture model execution
Proceedings of the 37th annual international symposium on Computer architecture
RAMP gold: an FPGA-based architecture simulator for multiprocessors
Proceedings of the 47th Design Automation Conference
Adaptive and Speculative Slack Simulations of CMPs on CMPs
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
COREMU: a scalable and portable parallel full-system emulator
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
HAsim: FPGA-based high-detail multicore simulation using time-division multiplexing
HPCA '11 Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture
ACM SIGARCH Computer Architecture News
MARSS: a full system simulator for multicore x86 CPUs
Proceedings of the 48th Design Automation Conference
An early memory hierarchy evaluation simulator for multimedia applications
Microprocessors & Microsystems
Hi-index | 0.00 |
Full-system simulators are extremely useful in evaluating design alternatives for multicore. However, state-of-the-art multicore simulators either lack good extensibility due to their tightly-coupled design between functional model (FM) and timing model (TM), or cannot guarantee cycle-accuracy. This paper conducts a comprehensive study on factors affecting cycle-accuracy and uncovers several contributing factors ignored before. Based on the study, we propose a loosely-coupled functional-driven full-system simulator for multicore, namely Transformer. To ensure extensibility and cycle-accuracy, Transformer leverages an architecture-independent interface between FM and TM and uses a lightweight scheme to detect and recover from execution divergence between FM and TM. Based on Transformer, a graduate student only needs to write about 180 lines of code and takes about two months to extend an X86 functional model (QEMU) in Transformer. Moreover, the loosely-coupled design also removes the complex interaction between FM and TM and opens the opportunity to parallelize FM and TM to improve performance. Experimental results show that Transformer achieves an average of 8.4% speedup over GEMS while guaranteeing the cycle-accuracy. A further parallelization between FM and TM leads to 35.3% speedup.