Temporal Partitioning and Scheduling Data Flow Graphs for Reconfigurable Computers
IEEE Transactions on Computers
Optimal temporal partitioning and synthesis for reconfigurable architectures
Proceedings of the conference on Design, automation and test in Europe
Synthesis of reconfigurable high-performance multicore systems
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
3D finite difference computation on GPUs using CUDA
Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units
Optimization of imprecise circuits represented by Taylor series and real-valued polynomials
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Automatically mapping applications to a self-reconfiguring platform
Proceedings of the Conference on Design, Automation and Test in Europe
Assessing Accelerator-Based HPC Reverse Time Migration
IEEE Transactions on Parallel and Distributed Systems
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
A scalable approach for automated precision analysis
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
Hi-index | 0.00 |
A design approach is proposed to automatically identify and exploit run-time reconfiguration opportunities while optimising resource utilisation. We introduce Configuration Data Flow Graph, a hierarchical graph structure enabling reconfigurable designs to be synthesised in three steps: function analysis, configuration organisation, and run-time solution generation. Three applications, based on barrier option pricing, particle filter, and reverse time migration are used in evaluating the proposed approach. The run-time solutions approximate the theoretical performance by eliminating idle functions, and are 1.61 to 2.19 times faster than optimised static designs. FPGA designs developed with the proposed approach are up to 28.8 times faster than optimised CPU reference designs and 1.55 times faster than optimised GPU designs.