Input data reuse in compiling window operations onto reconfigurable hardware
Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Performance and Area Modeling of Complete FPGA Designs in the Presence of Loop Transformations
IEEE Transactions on Computers
RAT: RC Amenability Test for Rapid Performance Prediction
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Roofline: an insightful visual performance model for multicore architectures
Communications of the ACM - A Direct Path to Dependable Software
Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms
Journal of Parallel and Distributed Computing
Performance tuning and analysis of future vector processors based on the roofline model
Proceedings of the 10th workshop on MEmory performance: DEaling with Applications, systems and architecture
Performance Analysis Framework for High-Level Language Applications in Reconfigurable Computing
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Designing Modular Hardware Accelerators in C with ROCCC 2.0
FCCM '10 Proceedings of the 2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines
Accurate Area, Time and Power Models for FPGA-Based Implementations
Journal of Signal Processing Systems
ISORCW '11 Proceedings of the 2011 14th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing Workshops
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
GPURoofline: a model for guiding performance optimizations on GPUs
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
Performance and toolchain of a combined GPU/FPGA desktop (abstract only)
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
Performance modeling of pipelined linear algebra architectures on FPGAs
ARC'13 Proceedings of the 9th international conference on Reconfigurable Computing: architectures, tools, and applications
Hi-index | 0.00 |
The potential of FPGAs as accelerators for high-performance computing applications is very large, but many factors are involved in their performance. The design for FPGAs and the selection of the proper optimizations when mapping computations to FPGAs lead to prohibitively long developing time. Alternatives are the high-level synthesis (HLS) tools, which promise a fast design space exploration due to design at high-level or analytical performance models which provide realistic performance expectations, potential impediments to performance, and optimization guidelines. In this paper we propose the combination of both, in order to construct a performance model for FPGAs which is able to visually condense all the helpful information for the designer. Our proposed model extends the roofline model, by considering the resource consumption and the parameters used in the HLS tools, to maximize the performance and the resource utilization within the area of the FPGA. The proposed model is applied to optimize the design exploration of a class of window-based image processing applications using two different HLS tools. The results show the accuracy of the model as well as its flexibility to be combined with any HLS tool.