Fast computer simulation is an essential tool in the design of large parallel computers. Our Fast Accurate Simulation Tool, FAST, accurately simulates large shared-memory multiprocessors and their execution of parallel applications at speeds one to two orders of magnitude faster than comparable earlier simulators. The key ideas involve execution-driven simulation techniques that modify the object code of the application program being studied. This produces an augmented version of the code that is executed directly and performs much of the work of the simulation. We extend previous work by introducing several new uses of code augmentation. In this paper we summarize the tradeoffs made in the design of this simulator and of its predecessors. In previous simulators, these tradeoffs have often sacrificed accuracy for faster simulation. However, by carefully selecting techniques and deciding when to apply them, we have built a simulator that is both faster and more accurate than previous simulation systems. The improved accuracy comes from applying code augmentation techniques at a uniformly low level and from context switching that is fast enough to make accuracy/performance tradeoffs unnecessary. Our simulator has a modular design and has been configured in many ways. It has been used to conduct numerous experiments on multithreaded machine behavior, application behavior, cache behavior, compiler optimization, and traffic patterns. Because of its high performance, we have been able to simulate larger machines than would otherwise have been feasible.
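To make the idea of code augmentation concrete, the following is a minimal sketch in Python and not FAST's actual implementation, which rewrites real object code. It assumes a toy instruction set with three opcodes, a fixed one-cycle cost per instruction, and a trivial "cache" in which an address hits once it has been touched. Every name here (Instr, augment, run, CYCLE_COST, the "tick" and "memhook" pseudo-ops, the 20-cycle miss latency) is hypothetical and chosen only for illustration. The sketch shows the general pattern: static timing is precomputed and inserted as cheap inline updates, and only memory references hand control to the simulator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str       # "add", "load", "store", or an inserted pseudo-op
    arg: int = 0  # address for load/store/memhook, cycle count for tick


# Assumed static cost in cycles per instruction, excluding memory latency.
CYCLE_COST = {"add": 1, "load": 1, "store": 1}


def augment(block):
    """Return an augmented copy of one basic block (hypothetical scheme).

    Static cycle costs are accumulated and emitted as a single "tick"
    before each memory reference and at the end of the block. Each
    load/store is preceded by a "memhook" that calls into the simulator.
    """
    out, pending = [], 0
    for ins in block:
        if ins.op in ("load", "store"):
            if pending:
                out.append(Instr("tick", pending))
                pending = 0
            out.append(Instr("memhook", ins.arg))
        pending += CYCLE_COST[ins.op]
        out.append(ins)  # the original instruction is kept unchanged
    if pending:
        out.append(Instr("tick", pending))
    return out


def run(augmented, miss_latency=20):
    """Execute the augmented block and return (cycles, memory events)."""
    clock = 0
    touched = set()  # toy cache model: an address hits after first touch
    events = []
    for ins in augmented:
        if ins.op == "tick":
            clock += ins.arg          # inline bookkeeping, no simulator call
        elif ins.op == "memhook":
            hit = ins.arg in touched  # simulator models the memory system
            touched.add(ins.arg)
            if not hit:
                clock += miss_latency
            events.append((clock, ins.arg, hit))
        # Original instructions would run natively on the host at this point.
    return clock, events


block = [
    Instr("add"), Instr("add"), Instr("load", 0x10),
    Instr("add"), Instr("load", 0x10), Instr("store", 0x20),
]
augmented = augment(block)
total_cycles, events = run(augmented)
```

For this six-instruction block the sketch charges 6 static cycles plus two 20-cycle misses, for 46 simulated cycles in total. The design point it illustrates is that most of the accounting is carried by the augmented code itself, so the simulator is entered only at events whose timing cannot be determined statically.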