Allocating Independent Subtasks on Parallel Processors
IEEE Transactions on Software Engineering
Distributed discrete-event simulation
ACM Computing Surveys (CSUR)
ACM Transactions on Computer Systems (TOCS)
The rice parallel processing testbed
SIGMETRICS '88 Proceedings of the 1988 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Efficient distributed event-driven simulations of multiple-loop networks
Communications of the ACM
Memory coherence in shared virtual memory systems
ACM Transactions on Computer Systems (TOCS)
Parallel discrete event simulation
Communications of the ACM - Special issue on simulation
SPLASH: Stanford parallel applications for shared-memory
ACM SIGARCH Computer Architecture News
The network architecture of the Connection Machine CM-5 (extended abstract)
SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Cooperative shared memory: software and hardware for scalable multiprocessors
ACM Transactions on Computer Systems (TOCS)
Mechanisms for cooperative shared memory
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The Wisconsin Wind Tunnel: virtual prototyping of parallel computers
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Efficient parallel simulation for designing multiprocessor systems
Efficient parallel simulation for designing multiprocessor systems
An empirical comparison of the Kendall Square Research KSR-1 and Stanford DASH multiprocessors
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Conservative Parallel Simulation of Priority Class Queuing Networks
IEEE Transactions on Parallel and Distributed Systems
PROTEUS: A HIGH-PERFORMANCE PARALLEL-ARCHITECTURE SIMULATOR
PROTEUS: A HIGH-PERFORMANCE PARALLEL-ARCHITECTURE SIMULATOR
Fast Accurate Simulation of Large Shared Memory Multiprocessors
Fast Accurate Simulation of Large Shared Memory Multiprocessors
Optimistic simulation of parallel architectures using program executables
PADS '96 Proceedings of the tenth workshop on Parallel and distributed simulation
Modeling cost/performance of a parallel computer simulator
ACM Transactions on Modeling and Computer Simulation (TOMACS)
A performance analysis model for distributed simulations
WSC '96 Proceedings of the 28th conference on Winter simulation
Profit-Effective Parallel Computing
IEEE Concurrency
Cost-Effective Parallel Computing
Computer
A model for parallel simulation of distributed shared memory
MASCOTS '96 Proceedings of the 4th International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
Hi-index | 0.00 |
This paper examines the cost/performance of simulating a hypothetical target parallel computer using a commercial host parallel computer. We address the question of whether parallel simulation is simply faster than sequential simulation, or if it is also more cost-effective. To answer this, we develop a performance model of the Wisconsin Wind Tunnel (WWT), a system that simulates cache-coherent shared-memory machines on a message-passing Thinking Machines CM-5. The performance model uses Kruskal and Weiss's fork-join model to account for the effect of event processing time variability on WWT's conservative fixed-window simulation algorithm. A generalization of Thiebaut and Stone's footprint model accurately predicts the effect of cache interference on the CM-5. The model is calibrated using parameters extracted from a fully-parallel simulation (p=N), and validated by measuring the speedup as the number of processors (p) ranges from one to the number of target nodes (N. Together with simple cost models, the performance model indicates that for target system sizes of 32 nodes and larger, parallel simulation is more cost-effective than sequential simulation. The key intuition behind this result is that large simulations require large memories, which dominate the cost of a uniprocessor; parallel computers allow multiple processors to simultaneously access this large memory.