Techniques for reducing overheads of shared-memory multiprocessing
ICS '95 Proceedings of the 9th international conference on Supercomputing
Reducing synchronization overhead in parallel simulation
PADS '96 Proceedings of the tenth workshop on Parallel and distributed simulation
Optimistic simulation of parallel architectures using program executables
PADS '96 Proceedings of the tenth workshop on Parallel and distributed simulation
Performance benefits of virtual channels and adaptive routing: an application-driven study
ICS '97 Proceedings of the 11th international conference on Supercomputing
ICS '97 Proceedings of the 11th international conference on Supercomputing
Efficient synchronization: let them eat QOLB
Proceedings of the 24th annual international symposium on Computer architecture
Impact of Virtual Channels and Adaptive Routing on Application Performance
IEEE Transactions on Parallel and Distributed Systems
How Much Does Network Contention Affect Distributed Shared Memory Performance?
ICPP '97 Proceedings of the international Conference on Parallel Processing
Network Performance under Physical Constraints
ICPP '97 Proceedings of the international Conference on Parallel Processing
Time-Sharing Parallel Jobs in the Presence of Multiple Resource Requirements
IPDPS '00/JSSPP '00 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing
HPPNetSim: a parallel simulation of large-scale interconnection networks
SpringSim '09 Proceedings of the 2009 Spring Simulation Multiconference
Instruction-level simulation of a cluster at scale
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Adaptive and Speculative Slack Simulations of CMPs on CMPs
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Hi-index | 0.00 |
Parallel simulation is emerging as the dominant technique for studying parallel computers. However the interconnection networks of these machines can be modeled at many different levels of abstraction, allowing researchers to trade off accuracy and performance. We use the Wisconsin Wind Tunnel, a parallel simulator for cache-coherent shared-memory machines, to study the trade-offs of accuracy versus performance for six different network simulation models. We evaluate these models for a variety of parallel applications, cache-coherence protocols, and topologies. We show that only the two most expensive models-which model contention at individual links-are robust in the presence of high network loads or non-uniform traffic patterns.