We introduce a novel approach to accelerating functional simulation. The key attributes of our approach are high performance, low cost, scalability, and low turnaround time (TAT). We achieve speedups of 25x to 2000x over zero-delay event-driven simulation and 75x to 1000x over cycle-based simulation on benchmark and industrial circuits, while maintaining the cost, scalability, and TAT advantages of simulation. Owing to these attributes, we believe that such an approach has potential for very wide deployment as a replacement for, or enhancement to, existing simulators. Our technology relies on a VLIW-like virtual simulation processor (SimPLE) mapped to a single FPGA on an off-the-shelf PCI board. Three factors are primarily responsible for the speed: (i) parallelism in the processor architecture, (ii) the high pin count of the FPGA, which enables large instruction bandwidth, and (iii) a high-speed (124 MHz on Xilinx Virtex-II) single-FPGA implementation of the processor with regularity-driven, efficient place and route. A companion to the processor is the very fast SimPLE compiler, which achieves compilation rates of 4 million gates/hour. To simulate the netlist, the compiled instructions are streamed through the FPGA along with the simulation vectors. This architecture plugs naturally into any existing HDL simulation environment. We have a working prototype based on a commercially available PCI-based FPGA board.
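
To illustrate the general idea of streaming compiled instructions and simulation vectors through a VLIW-like simulation processor, here is a minimal software sketch. It is not the SimPLE instruction set or compiler, which the abstract does not describe; the names `Slot`, `run_cycle`, `simulate`, and `FULL_ADDER`, the register-file model, and the two-input gate set are all assumptions made for this example. Each instruction bundles several gate evaluations that issue in parallel, and the netlist is assumed to be levelized so that every slot reads values produced by earlier instructions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# Two-input gate primitives on single bits (0 or 1). Hypothetical gate set.
OPS: Dict[str, Callable[[int, int], int]] = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
}


@dataclass(frozen=True)
class Slot:
    """One gate evaluation inside a wide instruction (assumed format)."""
    op: str
    src_a: int  # register index of first operand
    src_b: int  # register index of second operand
    dst: int    # register index receiving the result


# A wide instruction is a tuple of slots that issue together.
Instruction = Tuple[Slot, ...]


def run_cycle(program: Sequence[Instruction], regs: List[int],
              vector: Dict[int, int]) -> None:
    """Apply one input vector, then stream the whole program once."""
    for reg, value in vector.items():
        regs[reg] = value  # load primary inputs
    for instr in program:
        # Read all operands before writing any result, which models
        # the slots of one instruction executing in parallel.
        results = [(s.dst, OPS[s.op](regs[s.src_a], regs[s.src_b]))
                   for s in instr]
        for dst, value in results:
            regs[dst] = value


def simulate(program: Sequence[Instruction], num_regs: int,
             vectors: Sequence[Dict[int, int]],
             outputs: Sequence[int]) -> List[Tuple[int, ...]]:
    """Stream the program once per simulation vector; sample the outputs."""
    regs = [0] * num_regs
    trace = []
    for vector in vectors:
        run_cycle(program, regs, vector)
        trace.append(tuple(regs[o] for o in outputs))
    return trace


# A levelized full adder: r0=a, r1=b, r2=cin, r5=sum, r7=cout.
FULL_ADDER: List[Instruction] = [
    (Slot("XOR", 0, 1, 3), Slot("AND", 0, 1, 4)),  # level 1
    (Slot("XOR", 3, 2, 5), Slot("AND", 3, 2, 6)),  # level 2
    (Slot("OR", 4, 6, 7),),                        # level 3
]

if __name__ == "__main__":
    vectors = [{0: a, 1: b, 2: c}
               for a in (0, 1) for b in (0, 1) for c in (0, 1)]
    for vec, (s, cout) in zip(vectors, simulate(FULL_ADDER, 8, vectors, [5, 7])):
        print(vec, "sum =", s, "cout =", cout)
```

In this sketch the instruction width (slots per instruction) stands in for the processor parallelism, and re-running the same program for each vector stands in for streaming instructions through the device; a hardware implementation would differ in how registers, memories, and instruction bandwidth are organized.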