Field-programmable gate arrays (FPGAs) are increasingly used to accelerate applications in many fields of science. As modern digital designs integrate hundreds of interconnected processing and memory units, a higher level of abstraction is needed to describe them. This paper presents a beyond-RTL concurrent hardware description language that combines the finite-state machine (FSM) and constraint programming paradigms. At this level of abstraction, the user describes dynamic connections between data sources and sinks that may not always be ready to send or receive data tokens. The methodology allows behaviors such as data-transfer synchronization, exclusivity, priority, and constrained scheduling to be described clearly by means of logical-implication rules that constrain data-transfer authorizations. Dynamically connecting resources with potential combinatorial dependencies may lead to instability or deadlock. The proposed compiler automatically detects and fixes such situations, and generates a dedicated control circuit that optimizes the number of transfers that can be authorized at each clock cycle. The design automation methodology is applied to the problem of deeply pipelined vector reduction. A pipelined floating-point accumulator and a matrix multiplication circuit are each described in a few lines of code and automatically compiled for an FPGA. The synthesis results are comparable to those obtained with hand-written RTL, at much lower design effort and time.
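To make the idea of rule-constrained transfer authorization more concrete, the following is a minimal software sketch of a single clock cycle. It is not the paper's language or compiler: the function `authorize`, the helper `implies`, the resource names, and the three example rules are all hypothetical, and the brute-force search merely stands in for the dedicated control circuit the compiler would generate. It also assumes that "optimizing" the number of transfers means maximizing it.

```python
from itertools import combinations


def implies(p, q):
    """Logical implication: p -> q."""
    return (not p) or q


def authorize(transfers, ready, rules):
    """Return a largest set of transfers allowed to fire in this cycle.

    transfers: list of (source, sink) name pairs (hypothetical resources)
    ready:     set of resource names ready to send or receive this cycle
    rules:     predicates rule(chosen, ready) -> bool, all of which must hold
    """
    # A transfer is a candidate only if both of its endpoints are ready.
    candidates = [t for t in transfers if t[0] in ready and t[1] in ready]
    # Try larger subsets first, so the first valid subset has maximal size.
    for size in range(len(candidates), -1, -1):
        for subset in combinations(candidates, size):
            chosen = set(subset)
            if all(rule(chosen, ready) for rule in rules):
                return chosen
    return set()


# Illustrative setup: two sources (A, B) and two sinks (C, D).
AC, BC, BD = ("A", "C"), ("B", "C"), ("B", "D")
transfers = [AC, BC, BD]

rules = [
    # Exclusivity: sink C accepts at most one token per cycle.
    lambda s, r: implies(AC in s, BC not in s),
    # Exclusivity: a token from source B goes to only one sink.
    lambda s, r: implies(BC in s, BD not in s),
    # Priority: B may write to C only when A has nothing to send.
    lambda s, r: implies(BC in s, "A" not in r),
]

# With every resource ready, A->C and B->D can fire together.
print(sorted(authorize(transfers, {"A", "B", "C", "D"}, rules)))
```

The exhaustive search is exponential in the number of candidate transfers and is only meant to illustrate the semantics of the rules; it does not model the instability or deadlock detection mentioned in the abstract.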