This article describes and evaluates an efficient execution-driven technique for simulating multiprocessors. The technique includes simulation of system memory and is driven by real program workloads. It produces correctly interleaved address traces at run time, without disk access overhead or hardware support, so the effects of a variety of architectural alternatives on programs can be simulated accurately. We have implemented a simulator based on this technique that substantially reduces time and space overheads compared to instruction-driven or trace-driven simulation, without significant loss of accuracy. The article presents the results of several validation experiments that quantify the accuracy and efficiency of the simulator for sequential, distributed, and shared-memory multiprocessors, and for several parallel programs. These experiments show that the simulator can achieve prediction errors of less than 5 percent relative to actual execution times, with overheads 6 to 30 times lower than those incurred by cycle-level simulation. Predictions of relative performance metrics such as speedup tend to be even more accurate, which makes the technique especially attractive as an efficient method for comparative investigations of parallel system designs.
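To make the idea of producing an interleaved address trace at run time concrete, the following is a minimal sketch, not the simulator described in the article. It assumes that each simulated processor can be modeled as a Python generator yielding (compute cycles, address) pairs, and that a toy per-processor cache supplies the memory latency. All names (`worker`, `ToyCache`, `simulate`, `demo`) and the hit and miss latencies are hypothetical choices made for illustration.

```python
import heapq
from typing import Callable, Dict, Iterator, List, Tuple

# One memory reference: (cycles of local computation before it, address).
Ref = Tuple[int, int]


def worker(base: int, n: int, compute_cycles: int) -> Iterator[Ref]:
    """Hypothetical workload: touch n consecutive addresses, computing in between."""
    for i in range(n):
        yield (compute_cycles, base + i)


class ToyCache:
    """Assumed memory model: unbounded cache, fixed hit and miss latencies."""

    def __init__(self, hit: int = 1, miss: int = 10) -> None:
        self.hit, self.miss = hit, miss
        self.lines = set()

    def access(self, addr: int) -> int:
        if addr in self.lines:
            return self.hit
        self.lines.add(addr)
        return self.miss


def simulate(
    procs: Dict[int, Iterator[Ref]],
    mem_latency: Callable[[int, int], int],
) -> Tuple[List[Tuple[int, int, int]], Dict[int, int]]:
    """Always advance the processor whose next reference is earliest in
    simulated time, so the trace comes out in global time order."""
    clock = {pid: 0 for pid in procs}
    pending: List[Tuple[int, int, int]] = []  # (issue time, pid, address)
    for pid, gen in procs.items():
        first = next(gen, None)
        if first is not None:
            heapq.heappush(pending, (first[0], pid, first[1]))

    trace: List[Tuple[int, int, int]] = []
    while pending:
        t, pid, addr = heapq.heappop(pending)
        trace.append((t, pid, addr))          # consumed in memory, never written to disk
        clock[pid] = t + mem_latency(pid, addr)
        nxt = next(procs[pid], None)          # resume this processor's program
        if nxt is not None:
            heapq.heappush(pending, (clock[pid] + nxt[0], pid, nxt[1]))
    return trace, clock


def demo() -> Tuple[List[Tuple[int, int, int]], Dict[int, int]]:
    caches = {0: ToyCache(), 1: ToyCache()}
    procs = {0: worker(0, 3, 2), 1: worker(100, 2, 5)}
    return simulate(procs, lambda pid, addr: caches[pid].access(addr))


if __name__ == "__main__":
    trace, clock = demo()
    for t, pid, addr in trace:
        print(f"t={t:3d}  P{pid}  addr={addr}")
    print("finish times:", clock)
```

Because the memory model's latency feeds back into each processor's clock before its next reference is scheduled, changing the cache parameters changes the interleaving itself. A trace recorded in advance cannot capture this feedback, which is the property the sketch is meant to illustrate.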