The rice parallel processing testbed
SIGMETRICS '88 Proceedings of the 1988 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Interprocedural slicing using dependence graphs
ACM Transactions on Programming Languages and Systems (TOPLAS)
PROTEUS: a high-performance parallel-architecture simulator
SIGMETRICS '92/PERFORMANCE '92 Proceedings of the 1992 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
The Wisconsin Wind Tunnel: virtual prototyping of parallel computers
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
A distributed memory LAPSE: parallel simulation of message-passing programs
PADS '94 Proceedings of the eighth workshop on Parallel and distributed simulation
Reducing synchronization overhead in parallel simulation
PADS '96 Proceedings of the tenth workshop on Parallel and distributed simulation
Optimistic simulation of parallel architectures using program executables
PADS '96 Proceedings of the tenth workshop on Parallel and distributed simulation
Parallelized Direct Execution Simulation of Message-Passing Parallel Programs
IEEE Transactions on Parallel and Distributed Systems
Using the SimOS machine simulator to study complex computer systems
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Using integer sets for data-parallel program analysis and optimization
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
MPI-SIM: using parallel simulation to evaluate MPI programs
Proceedings of the 30th conference on Winter simulation
Performance prediction of large parallel applications using parallel simulations
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Compiler-supported simulation of highly scalable parallel applications
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Asynchronous Parallel Simulation of Parallel Programs
IEEE Transactions on Software Engineering
Complete Computer System Simulation: The SimOS Approach
IEEE Parallel & Distributed Technology: Systems & Technology
An adaptive synchronization method for unpredictable communication patterns in dataparallel programs
IPPS '95 Proceedings of the 9th International Symposium on Parallel Processing
International Journal of High Performance Computing Applications
Parallel simulation: parallel and distributed simulation systems
Proceedings of the 33nd conference on Winter simulation
Compiler-optimized simulation of large-scale applications on high performance architectures
Journal of Parallel and Distributed Computing - Parallel and Distributed Discrete Event Simulation--An Emerging Technology
Parallel simulation: distributed simulation systems
Proceedings of the 35th conference on Winter simulation: driving innovation
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 10 - Volume 11
Evolutionary performance-oriented development of parallel programs by composition of components
Proceedings of the 5th international workshop on Software and performance
Improving Lookahead in Parallel Multiprocessor Simulation Using Dynamic Execution Path Prediction
Proceedings of the 20th Workshop on Principles of Advanced and Distributed Simulation
Proceedings of the 21st International Workshop on Principles of Advanced and Distributed Simulation
Micro-Synchronization in Conservative Parallel Network Simulation
Proceedings of the 22nd Workshop on Principles of Advanced and Distributed Simulation
A time management optimization framework for large-scale distributed hardware-in-the-loop simulation
Proceedings of the 2013 ACM SIGSIM conference on Principles of advanced discrete simulation
Hi-index | 0.00 |
This paper addresses the issue of efficient and accurate performance prediction of large-scale message-passing applications on high performance architectures using simulation. Such simulators are often based on parallel discrete event simulation, typically using the conservative protocol to synchronize the simulation threads. The paper considers how a compiler can be used to automatically extract information about the lookahead present in the application, and how this can be used to improve the performance of the null protocol used for synchronization. These techniques are implemented in the MPI-Sim simulator and dHPF compiler, which had previously been extended to work together for optimizing the simulation of local computational components of an application. The results shows that the availability of lookahead information improves the runtime of the simulator by factors ranging from 9% up to two orders of magnitude, with 30-60% improvements being typical for the real-world codes. The experiments also show that these improvements are directly correlated with reductions in the number of null messages required by the simulations.