Warp speed: executing time warp on 1,966,080 cores

  • Authors: Peter D. Barnes, Jr.; Christopher D. Carothers; David R. Jefferson; Justin M. LaPre

  • Affiliations: Lawrence Livermore National Laboratory, Livermore, CA, USA; Rensselaer Polytechnic Institute, Troy, NY, USA; Lawrence Livermore National Laboratory, Livermore, CA, USA; Rensselaer Polytechnic Institute, Troy, NY, USA

  • Venue: Proceedings of the 2013 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation
  • Year: 2013

Abstract

Time Warp is an optimistic synchronization protocol for parallel discrete event simulation that coordinates the available parallelism through its rollback and antimessage mechanisms. In this paper we present the results of a strong scaling study of the ROSS simulator running Time Warp with reverse computation and executing the well-known PHOLD benchmark on Lawrence Livermore National Laboratory's Sequoia Blue Gene/Q supercomputer. The benchmark has 251 million PHOLD logical processes (LPs) and was executed in several configurations, up to a peak of 7.86 million MPI tasks running on 1,966,080 cores. At the largest scale it processed 33 trillion events in 65 seconds, yielding a sustained speed of 504 billion events/second using 120 racks of Sequoia. This is by far the highest event rate reported by any parallel discrete event simulation to date, whether running PHOLD or any other benchmark. Additionally, we believe it is likely to be the largest number of MPI tasks ever used in any computation of any kind to date. ROSS exhibited super-linear speedup throughout the strong scaling study, with more than a 97x speed improvement from scaling the number of cores by only 60x (from 32,768 to 1,966,080). We attribute this to significant cache-related performance acceleration as we moved to higher scales with fewer LPs per core. Prompted by historical performance results, we propose a new, long-term performance metric called Warp Speed that grows logarithmically with the PHOLD event rate. As we define it, our maximum speed of 504 billion PHOLD events/second corresponds to Warp 2.7. We suggest that the results described here are significant because they demonstrate that direct simulation of planetary-scale discrete event models is now, in principle at least, within reach.
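
The arithmetic behind the scaling figures quoted in the abstract can be checked with a short Python sketch. The abstract states only that Warp Speed grows logarithmically with the event rate; the specific form used here, log10(rate) - 9, is an assumption chosen because it reproduces the quoted Warp 2.7 at 504 billion events/second. The variable and function names are illustrative and do not come from the paper or from ROSS.

```python
import math

# Figures quoted in the abstract.
cores_small = 32_768
cores_large = 1_966_080
mpi_tasks = 7.86e6          # "7.86 million MPI tasks" (rounded in the abstract)
reported_speedup = 97.0     # "more than a 97x speed improvement"
reported_rate = 504e9       # events/second at the largest scale


def warp_speed(events_per_second: float) -> float:
    """Hypothetical Warp Speed: ASSUMED to be log10(rate) - 9.

    The abstract only says the metric is logarithmic in the PHOLD event
    rate; this offset is an assumption consistent with Warp 2.7.
    """
    return math.log10(events_per_second) - 9.0


# Strong scaling: how much the core count grew between the two endpoints.
core_ratio = cores_large / cores_small

# Speedup per unit of added hardware; a value above 1 means super-linear.
relative_efficiency = reported_speedup / core_ratio

# MPI tasks per core implied by the rounded task count.
tasks_per_core = mpi_tasks / cores_large

# Rate implied by the rounded "33 trillion events in 65 seconds"; it differs
# slightly from the reported 504 billion because both inputs are rounded.
implied_rate = 33e12 / 65

print(f"core ratio:          {core_ratio:.0f}x")
print(f"relative efficiency: {relative_efficiency:.2f}")
print(f"MPI tasks per core:  {tasks_per_core:.2f}")
print(f"implied event rate:  {implied_rate:.3e} events/s")
print(f"Warp Speed:          {warp_speed(reported_rate):.1f}")
```

Under the assumed formula, each tenfold increase in event rate adds one unit of Warp Speed, which is what makes a logarithmic metric suitable for tracking progress over long periods.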