Time Warp is an optimistic synchronization protocol for parallel discrete event simulation that coordinates the available parallelism through its rollback and antimessage mechanisms. In this paper we present the results of a strong scaling study of the ROSS simulator running Time Warp with reverse computation and executing the well-known PHOLD benchmark on Lawrence Livermore National Laboratory's Sequoia Blue Gene/Q supercomputer. The benchmark has 251 million PHOLD logical processes (LPs) and was executed in several configurations up to a peak of 7.86 million MPI tasks running on 1,966,080 cores. At the largest scale it processed 33 trillion events in 65 seconds, yielding a sustained speed of 504 billion events/second using 120 racks of Sequoia. This is by far the highest event rate reported by any parallel discrete event simulation to date, whether running PHOLD or any other benchmark. Additionally, we believe it is likely to be the largest number of MPI tasks ever used in any computation of any kind to date. ROSS exhibited super-linear speedup throughout the strong scaling study, with more than a 97x speed improvement from scaling the number of cores by only 60x (from 32,768 to 1,966,080). We attribute this to significant cache-related performance acceleration as we moved to higher scales with fewer LPs per core. Prompted by historical performance results, we propose a new, long-term performance metric called Warp Speed that grows logarithmically with the PHOLD event rate. As we define it, our maximum speed of 504 billion PHOLD events/second corresponds to Warp 2.7. We suggest that the results described here are significant because they demonstrate that direct simulation of planetary-scale discrete event models is now, in principle at least, within reach.
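The arithmetic behind these figures can be checked with a short sketch. The abstract says only that Warp Speed grows logarithmically with the PHOLD event rate, so the formula used here, log10(events per second) minus 9, is an assumption chosen because it reproduces the quoted Warp 2.7; the authoritative definition is in the body of the paper. The function names and the `offset` parameter are hypothetical.

```python
import math


def warp_speed(events_per_second: float, offset: float = 9.0) -> float:
    """Logarithmic event-rate metric.

    ASSUMPTION: log10(rate) - 9. The abstract gives only the logarithmic
    growth and one data point (504e9 events/s -> Warp 2.7); this form is
    picked because it matches that data point.
    """
    return math.log10(events_per_second) - offset


def scaling_efficiency(speedup: float, core_ratio: float) -> float:
    """Speedup divided by the growth in core count; > 1 means super-linear."""
    return speedup / core_ratio


# Figures quoted in the abstract.
CORES_SMALL = 32_768
CORES_LARGE = 1_966_080
PEAK_EVENT_RATE = 504e9      # events/second at the largest scale
REPORTED_SPEEDUP = 97.0      # "more than 97x", so this is a lower bound
LOGICAL_PROCESSES = 251e6    # PHOLD LPs, constant under strong scaling

core_ratio = CORES_LARGE / CORES_SMALL
efficiency = scaling_efficiency(REPORTED_SPEEDUP, core_ratio)

# Fewer LPs per core at larger scale is the cache effect the authors cite.
lps_per_core_small = LOGICAL_PROCESSES / CORES_SMALL
lps_per_core_large = LOGICAL_PROCESSES / CORES_LARGE

if __name__ == "__main__":
    print(f"core ratio:           {core_ratio:.0f}x")
    print(f"scaling efficiency:   {efficiency:.2f}")
    print(f"LPs per core (small): {lps_per_core_small:.0f}")
    print(f"LPs per core (large): {lps_per_core_large:.0f}")
    print(f"Warp Speed (assumed): {warp_speed(PEAK_EVENT_RATE):.1f}")
```

Under this assumed form, each additional unit of Warp Speed corresponds to a tenfold increase in event rate, which is what makes a logarithmic metric convenient for tracking progress over long periods.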