The accuracy of trace-driven simulations of multiprocessors

Authors:
Stephen R. Goldschmidt; John L. Hennessy
Affiliations:
Stanford Univ., Stanford, CA; Stanford Univ., Stanford, CA
Venue:
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Year:
1993

Abstract

In trace-driven simulation, traces generated for one set of system characteristics are used to simulate a system with different characteristics. However, the execution path of a multiprocessor workload may depend on the order of events occurring on different processing elements. The event order, in turn, depends on system characteristics such as memory-system latencies and buffer sizes. Trace-driven simulations of multiprocessor workloads are therefore inaccurate unless these timing dependencies are eliminated from the traces.

We have measured the effects of these inaccuracies by comparing trace-driven simulations to direct simulations of the same workloads. The simulators predicted identical performance only for workloads whose traces were timing-independent. Workloads that used first-come first-served scheduling and/or non-deterministic algorithms produced timing-dependent traces, and simulation of these traces produced inaccurate performance predictions. Two types of performance metrics were particularly affected: those related to synchronization latency and those derived from relatively small numbers of events. To accurately predict such performance metrics, timing-independent traces or direct simulation should be used.
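The timing dependence described in the abstract can be illustrated with a toy model. The sketch below is not the authors' simulator; the task list, the cost model (compute cycles plus memory references times latency), and the function names `fcfs_schedule` and `static_schedule` are all assumptions made for illustration. It shows that under first-come first-served dispatch the set of tasks each processor executes, and hence its trace, changes when the memory latency changes, while a static assignment does not.

```python
import heapq

# Hypothetical workload: each task is (compute_cycles, memory_references).
TASKS = [(10, 1), (2, 3), (4, 2), (4, 2)]


def fcfs_schedule(tasks, n_procs, latency):
    """Per-processor task lists under first-come first-served dispatch.

    Assumed cost model: a task takes compute + mem_refs * latency cycles.
    The next task goes to whichever processor becomes free first
    (ties go to the lower processor id).
    """
    free_at = [(0, p) for p in range(n_procs)]  # (time free, processor id)
    heapq.heapify(free_at)
    trace = [[] for _ in range(n_procs)]
    for task_id, (compute, mem_refs) in enumerate(tasks):
        now, p = heapq.heappop(free_at)
        trace[p].append(task_id)
        heapq.heappush(free_at, (now + compute + mem_refs * latency, p))
    return trace


def static_schedule(tasks, n_procs):
    """Per-processor task lists under a fixed round-robin assignment."""
    trace = [[] for _ in range(n_procs)]
    for task_id in range(len(tasks)):
        trace[task_id % n_procs].append(task_id)
    return trace


if __name__ == "__main__":
    print("FCFS, latency 1: ", fcfs_schedule(TASKS, 2, latency=1))
    print("FCFS, latency 10:", fcfs_schedule(TASKS, 2, latency=10))
    print("static:          ", static_schedule(TASKS, 2))
```

In this toy model, a trace recorded with FCFS dispatch at latency 1 assigns tasks 0 and 3 to processor 0, whereas a direct simulation at latency 10 assigns it tasks 0 and 2. Replaying the first trace on the second system would therefore simulate a work distribution that the real workload would not have produced. The static assignment never consults the clock, so its trace is timing-independent.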