Current microprocessors are effectively systems-on-a-chip: they incorporate processing cores, interconnections, shared and private caches, and DRAM controllers on a single die. Consequently, it is imperative to have fast and accurate simulation tools for such systems. This paper presents such a tool for simulating all current and announced variants of multicore processors that use the predominant PC (X86, X86-64) instruction set, as well as external DRAM and buses. We discuss the major techniques used for speeding up the simulation and improving its overall accuracy, and the simulation of system-level details such as coherent caches, on-chip interconnections, the memory bus, and DRAM. We also demonstrate an 8-fold speedup over a widely used tool.