CRAW/P: a workload partition method for the efficient parallel simulation of manycores

Authors:
Shuai Jiao;Paolo Ienne;Xiaochun Ye;Da Wang;Dongrui Fan;Ninghui Sun
Affiliations:
SKL Computer Architecture, ICT, CAS, Beijing, P.R. China, Graduate University of Chinese Academy of Sciences, Beijing, P.R. China;École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland;SKL Computer Architecture, ICT, CAS, Beijing, P.R. China;SKL Computer Architecture, ICT, CAS, Beijing, P.R. China;SKL Computer Architecture, ICT, CAS, Beijing, P.R. China;SKL Computer Architecture, ICT, CAS, Beijing, P.R. China
Venue:
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
Year:
2012

Abstract

This paper addresses workload partition strategies for the simulation of manycore architectures. The key observation behind this paper is that, compared to traditional multicores, manycores feature more non-uniform memory access and unpredictable network traffic; these features degrade the simulation speed and accuracy of Parallel Discrete Event Simulators (PDES) when static workload partition schemes are used. Based on this observation, we propose an adaptive workload partition method: Core/Router-Adaptive Workload Partition (CRAW/P). The method delivers more speedup and accuracy than static partition schemes by partitioning the simulation of the on-chip network independently from that of the cores and by synchronizing them differently. Using a PDES simulator, we evaluate the performance of CRAW/P in simulating a 256-core general-purpose manycore processor. Running SPLASH2 benchmark applications, the experimental results demonstrate that it improves simulation speed by 28% to 67% over a static partition scheme and reduces timing errors to