Statistical sampling of microarchitecture simulation

  • Authors:
  • Thomas F. Wenisch; Roland E. Wunderlich; Babak Falsafi; James C. Hoe

  • Affiliations:
  • Computer Architecture Laboratory, Carnegie Mellon University, Pittsburgh, PA (all four authors)

  • Venue:
  • IPDPS'06: Proceedings of the 20th International Conference on Parallel and Distributed Processing
  • Year:
  • 2006

Abstract

Current software-based microarchitecture simulators are many orders of magnitude slower than the hardware they simulate. Hence, most microarchitecture design studies draw their conclusions from drastically truncated benchmark simulations that are often inaccurate and misleading. The Sampling Microarchitecture Simulation (SMARTS) framework is an approach to enable fast and accurate performance measurements of full-length benchmarks. SMARTS accelerates simulation by selectively measuring in detail only an appropriate benchmark subset. SMARTS prescribes a statistically sound procedure for configuring a systematic sampling simulation run to achieve a desired quantifiable confidence in estimates. Analysis of the SPEC CPU2000 benchmark suite shows that CPI can be estimated to within ±3% with 99.7% confidence by measuring fewer than 50 million instructions per benchmark. In practice, inaccuracy in microarchitectural state initialization introduces an additional uncertainty, which we empirically bound to approximately 2% for the tested benchmarks. We present two implementations of SMARTS that both achieve an average error of only 0.64% on CPI. SMARTSim constructs accurate model state through functional warming: it continuously warms large microarchitectural structures (e.g., caches and the branch predictor) while functionally simulating the billions of instructions between measurements. This reduces average simulation turnaround from 5.5 days to 7.0 hours. TurboSMARTSim replaces functional warming with live-points, checkpoints that store a bare minimum of functionally warmed state for accurate simulation of a limited execution window, further reducing average turnaround to 91 seconds.
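
The confidence claim in the abstract (±3% at 99.7% confidence) can be illustrated with a minimal sketch of systematic sampling. This is not the SMARTS implementation. It assumes the textbook normal-approximation confidence interval, where 99.7% confidence corresponds to z = 3, and it uses a synthetic list of per-unit CPI values in place of real simulator output. The function names, the pilot-sample step, and the synthetic data are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def systematic_sample(units, interval, offset=0):
    """Select every `interval`-th sampling unit, starting at `offset`."""
    return units[offset::interval]


def required_sample_size(cv, rel_error, z):
    """Smallest n such that z * cv / sqrt(n) <= rel_error.

    cv is the coefficient of variation (stdev / mean) of the per-unit
    metric; rel_error is the target relative half-width (0.03 for +/-3%).
    """
    return math.ceil((z * cv / rel_error) ** 2)


def estimate_with_confidence(sample, z):
    """Return (mean, cv, relative half-width of the confidence interval)."""
    mean = statistics.mean(sample)
    cv = statistics.stdev(sample) / mean
    rel_half_width = z * cv / math.sqrt(len(sample))
    return mean, cv, rel_half_width


if __name__ == "__main__":
    # Assumption: synthetic per-unit CPI values with two alternating phases.
    rng = random.Random(0)
    population = [
        1.0 + rng.random() + (0.8 if (i // 5000) % 2 else 0.0)
        for i in range(200_000)
    ]

    # A small pilot sample gives a first estimate of the variation.
    pilot = systematic_sample(population, interval=2000)
    _, pilot_cv, _ = estimate_with_confidence(pilot, z=3.0)

    # Size the real run for +/-3% at 99.7% confidence (z = 3).
    n = required_sample_size(pilot_cv, rel_error=0.03, z=3.0)
    interval = max(1, len(population) // n)
    sample = systematic_sample(population, interval)

    mean, _, half = estimate_with_confidence(sample, z=3.0)
    true_mean = statistics.mean(population)
    print(f"units measured: {len(sample)} of {len(population)}")
    print(f"estimated CPI: {mean:.4f} (+/-{100 * half:.2f}%)")
    print(f"true CPI:      {true_mean:.4f}")
```

The sketch covers only sampling error. The additional uncertainty from microarchitectural state initialization, which the abstract bounds empirically, is a separate effect that this formula does not capture.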