PCantorSim: Accelerating parallel architecture simulation through fractal-based sampling

  • Authors: Chuntao Jiang; Zhibin Yu; Hai Jin; Chengzhong Xu; Lieven Eeckhout; Wim Heirman; Trevor E. Carlson; Xiaofei Liao

  • Affiliations (in author order): Huazhong University of Science and Technology, Wuhan, China; Shenzhen Institute of Advanced Technology, CAS; Huazhong University of Science and Technology, Wuhan, China; Shenzhen Institute of Advanced Technology/Wayne State University; Ghent University, Belgium; Ghent University, Belgium; Ghent University, Belgium; Huazhong University of Science and Technology, Wuhan, China

  • Venue: ACM Transactions on Architecture and Code Optimization (TACO)
  • Year: 2013

Abstract

Computer architects rely heavily on microarchitecture simulation to evaluate design alternatives. Unfortunately, cycle-accurate simulation is extremely slow: at least 4 to 6 orders of magnitude slower than real hardware. This longstanding problem is further exacerbated in the multi-/many-core era, because single-threaded simulation performance has not improved much while the design space has expanded substantially. Parallel simulation is a promising approach, yet it does not completely solve the simulation challenge. Furthermore, existing sampling techniques, which are widely used for single-threaded applications, do not readily apply to multithreaded applications because thread interaction and synchronization must be taken into account. This work presents PCantorSim, a novel sampling scheme based on the Cantor set (a classic fractal) that accelerates parallel simulation of multithreaded applications. With the proposed methodology, less than 5% of an application's execution time is simulated in detail. We have implemented our approach in Sniper (a parallel multicore simulator) and evaluated it by running the PARSEC benchmarks on a simulated 8-core system. The results show that PCantorSim speeds up simulation by a factor of 20 on average over detailed parallel simulation, with an average absolute execution time prediction error of 5.3%.
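The abstract does not spell out how the Cantor set is mapped onto an application's execution, so the following is only an illustrative sketch of the underlying fractal, not the authors' implementation. It assumes the classic middle-third construction and assumes that the surviving intervals mark the regions to simulate in detail; the function names `cantor_intervals`, `detailed_fraction`, and `sampling_plan` are hypothetical and are not part of Sniper or PCantorSim.

```python
from fractions import Fraction


def cantor_intervals(depth):
    """Intervals of [0, 1] that survive `depth` middle-third removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(depth):
        next_level = []
        for start, end in intervals:
            third = (end - start) / 3
            # Keep the left and right thirds, drop the middle third.
            next_level.append((start, start + third))
            next_level.append((end - third, end))
        intervals = next_level
    return intervals


def detailed_fraction(depth):
    """Share of the unit interval covered by the surviving intervals."""
    return sum(end - start for start, end in cantor_intervals(depth))


def sampling_plan(total_instructions, depth):
    """Scale the intervals to instruction counts (assumed unit of execution)."""
    return [
        (int(start * total_instructions), int(end * total_instructions))
        for start, end in cantor_intervals(depth)
    ]


if __name__ == "__main__":
    depth = 8
    print(len(cantor_intervals(depth)), float(detailed_fraction(depth)))
```

Under these assumptions the covered share after n removals is (2/3)^n. At depth 8 the construction leaves 256 intervals covering 256/6561, roughly 3.9% of the timeline, which is in line with the "less than 5%" figure quoted in the abstract; at depth 7 the share is still about 5.9%. Whether PCantorSim uses this depth, or this mapping at all, is not stated in the abstract.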