Actual circuits are inherently parallel, which suggests that a switch-level simulation machine might exploit this parallelism to reduce total simulation time. This paper explores the extent to which this parallelism exists and the extent to which it can be exploited. The exploration is done in the context of a proposed multiprocessor simulation machine called the Fast-1. The Fast-1 is a form of data-flow machine in which switch-level circuits are represented as programs consisting of transistor and node instructions. In a multiprocessor Fast-1, these programs are partitioned onto one or more processors. Experiments were performed on a simulation of the Fast-1 using thirteen circuits, ranging in size from 78 to 20,233 transistors. The most parallel circuit measured in these experiments could potentially be simulated almost 200 times faster on a multiprocessor than on a uniprocessor, assuming one instruction per processor and no-cost interprocessor communication. With 64 processors, an actual speedup of 28 was achieved using a contention-free interconnect, while a speedup of 12 was achieved when the 64 processors were connected by a broadcast bus for which they had to arbitrate.
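
To put the 64-processor figures in perspective, the reported speedups can be converted to parallel efficiency, defined here as speedup divided by processor count. The short Python sketch below does only that arithmetic. The function and variable names are illustrative and do not come from the paper, and the efficiency metric itself is an assumption of this example rather than a measure the abstract reports.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return speedup per processor (1.0 would be ideal linear scaling)."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Figures taken from the abstract: 64 processors, two interconnect models.
PROCESSORS = 64
reported_speedups = {
    "contention-free interconnect": 28.0,
    "arbitrated broadcast bus": 12.0,
}

# Hypothetical helper table: efficiency for each interconnect model.
efficiencies = {
    name: parallel_efficiency(speedup, PROCESSORS)
    for name, speedup in reported_speedups.items()
}

for name, eff in efficiencies.items():
    print(f"{name}: {eff:.1%} efficiency on {PROCESSORS} processors")
```

Under this definition, the contention-free case uses a little under half of the ideal 64-fold speedup and the broadcast-bus case a little under a fifth, which is consistent with the abstract's point that interprocessor communication limits how much of the available parallelism can be exploited.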