Massively Parallel Logic Simulation with GPUs

Authors:
Yuhao Zhu;Bo Wang;Yangdong Deng
Affiliations:
Beihang University;Tsinghua University;Tsinghua University
Venue:
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Year:
2011

Abstract

In this article, we develop a massively parallel gate-level logic simulator to address the ever-increasing computing demand of VLSI verification. To the best of the authors’ knowledge, this work is the first to leverage modern GPUs to exploit the massive parallelism of a conservative discrete event-driven algorithm, the Chandy-Misra-Bryant (CMB) algorithm. A novel data-parallel strategy is proposed to handle the fine-grained message passing required by the CMB protocol. To support robust and complete simulation of real VLSI designs, we establish both a memory paging mechanism and an adaptive issuing strategy that make efficient use of the limited GPU memory capacity. A set of GPU architecture-specific optimizations further improves overall simulation performance. On average, our simulator outperforms a CPU baseline event-driven simulator by a factor of 47.4X. This work shows that the CMB algorithm can be deployed efficiently and effectively on modern GPUs without the performance overhead that hindered its successful application on previous parallel architectures.
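
As a rough illustration of the conservative CMB idea mentioned in the abstract, the sketch below simulates a two-gate circuit in plain Python. It is not the authors' GPU implementation. The names `Gate`, `simulate`, and `run_demo`, the unit gate delay, and the stimulus values are all assumptions made for this example. Each gate keeps one FIFO of timestamped messages per input pin and only consumes events up to the smallest channel clock, which is the point before which no pin can still deliver an earlier event. The two-phase loop (evaluate every gate independently, then deliver messages) only hints at how such work could be spread across data-parallel threads.

```python
from collections import deque


class Gate:
    """One logic gate with a FIFO of (time, value) messages per input pin."""

    def __init__(self, name, func, n_inputs, delay=1):
        self.name, self.func, self.delay = name, func, delay
        self.queues = [deque() for _ in range(n_inputs)]
        self.clock = [0] * n_inputs   # latest timestamp received on each pin
        self.value = [0] * n_inputs   # current logic value on each pin
        self.fanout = []              # (gate, pin) pairs driven by this gate
        self.last_out = None
        self.trace = []               # output value changes as (time, value)

    def receive(self, pin, time, value):
        self.queues[pin].append((time, value))
        self.clock[pin] = time

    def evaluate(self):
        """Consume every provably safe event; return outgoing messages."""
        safe_time = min(self.clock)   # no pin can later deliver an earlier event
        outbox = []
        while True:
            heads = [q[0][0] for q in self.queues if q]
            if not heads or min(heads) > safe_time:
                break
            t = min(heads)
            for pin, q in enumerate(self.queues):
                if q and q[0][0] == t:
                    self.value[pin] = q.popleft()[1]
            out = self.func(*self.value)
            if out != self.last_out:
                self.trace.append((t + self.delay, out))
                self.last_out = out
            # A message with an unchanged value plays the role of a null
            # message: it only advances the receiver's channel clock.
            outbox.extend((g, pin, t + self.delay, out) for g, pin in self.fanout)
        return outbox


def simulate(gates):
    """Alternate between per-gate evaluation and message delivery."""
    while True:
        outboxes = [g.evaluate() for g in gates]        # phase 1: independent
        messages = [m for box in outboxes for m in box]
        if not messages:
            break
        for gate, pin, time, value in messages:         # phase 2: deliver
            gate.receive(pin, time, value)


def run_demo():
    nand = Gate("nand", lambda a, b: 1 - (a & b), 2)
    inv = Gate("inv", lambda a: 1 - a, 1)
    nand.fanout.append((inv, 0))
    # Hypothetical stimulus; the repeated value at t=40 marks the end time.
    for t, v in [(0, 0), (10, 1), (40, 1)]:
        nand.receive(0, t, v)
    for t, v in [(0, 1), (20, 0), (40, 0)]:
        nand.receive(1, t, v)
    simulate([nand, inv])
    return nand.trace, inv.trace
```

Calling `run_demo()` returns the output transitions of both gates. With the stimulus above, the inverter output rises at time 12 and falls at time 22, which is the AND of the two inputs delayed by two gate delays. The sketch omits feedback loops, which would need initial messages to start, and it sends a message after every evaluation rather than only when deadlock threatens.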