Mechanisms for managing message buffers in Time Warp parallel simulations executing on cache-coherent shared-memory multiprocessors are studied. Two simple buffer management strategies called the sender pool and receiver pool mechanisms are examined with respect to their efficiency, and in particular, their interaction with multiprocessor cache-coherence protocols. Measurements of implementations on a Kendall Square Research KSR-2 machine using both synthetic workloads and benchmark applications demonstrate that sender pools offer significant performance advantages over receiver pools. However, it is also observed that both schemes, especially the sender pool mechanism, are prone to severe performance degradations due to poor locality of reference in large simulations using substantial amounts of message buffer memory. A third strategy called the partitioned buffer pool approach is proposed that exploits the advantages of sender pools, but exhibits much better locality. Measurements of this approach indicate that the partitioned pool mechanism yields substantially better performance than both the sender and receiver pool schemes for large-scale, small-granularity parallel simulation applications.

The central conclusions from this study are: (1) buffer management strategies play an important role in determining the overall efficiency of multiprocessor-based parallel simulators, and (2) the partitioned buffer pool organization offers significantly better performance than the sender and receiver pool schemes. These studies demonstrate that poor performance may result if proper attention is not paid to realizing an efficient buffer management mechanism.