Parallel simulation of chip-multiprocessor architectures

  • Authors:
  • Matthew Chidester;Alan George

  • Affiliations:
  • Intel Corporation, Hillsboro, OR;University of Florida, Gainesville, FL

  • Venue:
  • ACM Transactions on Modeling and Computer Simulation (TOMACS)
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

Chip-multiprocessor (CMP) architectures present a challenge for efficient simulation, combining the requirements of a detailed microprocessor simulator with that of a tightly-coupled parallel system. In this paper, a distributed simulator for target CMPs is presented based on the Message Passing Interface (MPI) designed to run on a host cluster of workstations. Microbenchmark-based evaluation is used to narrow the parallelization design space concerning the performance impact of distributed vs. centralized target L2 simulation, blocking vs. non-blocking remote cache accesses, null-message vs. barrier techniques for clock synchronization, and network interconnect selection. The best combination is shown to yield speedups of up to 16 on a 9-node cluster of dual-CPU workstations, partially due to cache effects.