Reducing Null Messages in Misra's Distributed Discrete Event Simulation Method
IEEE Transactions on Software Engineering
The Wisconsin Wind Tunnel: virtual prototyping of parallel computers
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
The case for a single-chip multiprocessor
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Modeling cost/performance of a parallel computer simulator
ACM Transactions on Modeling and Computer Simulation (TOMACS)
International Journal of Parallel Programming - Special issue on instruction-level parallel processing—part II
ICS '98 Proceedings of the 12th international conference on Supercomputing
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
IEEE Transactions on Computers
Clock rate versus IPC: the end of the road for conventional microarchitectures
Proceedings of the 27th annual international symposium on Computer architecture
Piranha: a scalable architecture based on single-chip multiprocessing
Proceedings of the 27th annual international symposium on Computer architecture
Architecture of the Atlas Chip-Multiprocessor: Dynamically Parallelizing Irregular Applications
IEEE Transactions on Computers
Parallel and Distribution Simulation Systems
Parallel and Distribution Simulation Systems
Parallel Computer Architecture: A Hardware/Software Approach
Parallel Computer Architecture: A Hardware/Software Approach
Computer
A Case for NOW (Networks of Workstations)
IEEE Micro
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
The Alpha 21264 Microprocessor Architecture
ICCD '98 Proceedings of the International Conference on Computer Design
Mint Tutorial and User Manual
Simulation of Computer Architectures: Simulators, Benchmarks, Methodologies, and Recommendations
IEEE Transactions on Computers
Full-system chip multiprocessor power evaluations using FPGA-based emulation
Proceedings of the 13th international symposium on Low power electronics and design
ProtoFlex: Towards Scalable, Full-System Multiprocessor Simulations Using FPGAs
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
ACM SIGARCH Computer Architecture News
SlackSim: a platform for parallel simulations of CMPs on CMPs
ACM SIGARCH Computer Architecture News
SlackSim: a platform for parallel simulations of CMPs on CMPs
ACM SIGMETRICS Performance Evaluation Review
Parallel simulation of chip-multiprocessor by using multi-threading
AsiaMS '07 Proceedings of the IASTED Asian Conference on Modelling and Simulation
Software—Practice & Experience
FastFwd: an efficient hardware acceleration technique for trace-driven network-on-chip simulation
CODES/ISSS '10 Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Adaptive and Speculative Slack Simulations of CMPs on CMPs
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Accelerating UNISIM-Based Cycle-Level Microarchitectural Simulations on Multicore Platforms
ACM Transactions on Design Automation of Electronic Systems (TODAES)
CRAW/P: a workload partition method for the efficient parallel simulation of manycores
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
A survey on cache tuning from a power/energy perspective
ACM Computing Surveys (CSUR)
ZSim: fast and accurate microarchitectural simulation of thousand-core systems
Proceedings of the 40th Annual International Symposium on Computer Architecture
Optimizing parallel simulation of multicore systems using domain-specific knowledge
Proceedings of the 2013 ACM SIGSIM conference on Principles of advanced discrete simulation
Microprocessors & Microsystems
Hi-index | 0.00 |
Chip-multiprocessor (CMP) architectures present a challenge for efficient simulation, combining the requirements of a detailed microprocessor simulator with that of a tightly-coupled parallel system. In this paper, a distributed simulator for target CMPs is presented based on the Message Passing Interface (MPI) designed to run on a host cluster of workstations. Microbenchmark-based evaluation is used to narrow the parallelization design space concerning the performance impact of distributed vs. centralized target L2 simulation, blocking vs. non-blocking remote cache accesses, null-message vs. barrier techniques for clock synchronization, and network interconnect selection. The best combination is shown to yield speedups of up to 16 on a 9-node cluster of dual-CPU workstations, partially due to cache effects.