Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis
Journal of Real-Time Image Processing
FPGAs are an attractive platform for applications that combine high computation demand with low energy consumption requirements. However, the design effort for FPGA implementations remains high, often an order of magnitude larger than the design effort using high-level languages. As an alternative to this time-consuming process, high-level synthesis (HLS) tools generate hardware implementations from algorithm descriptions in languages such as C/C++ and SystemC. Such tools reduce design effort because high-level descriptions are more compact and less error-prone. HLS tools also promise hardware development that does not require the software designer to know the implementation platform. In this paper, we present an unbiased study of the performance, usability, and productivity of HLS using AutoPilot, a state-of-the-art HLS tool. We first evaluate AutoPilot using popular embedded benchmark kernels. Then, to evaluate the suitability of HLS for real-world applications, we perform a case study of stereo matching, an active area of computer vision research that uses techniques also common in image denoising, image retrieval, feature matching, and face recognition. Based on our study, we provide insights into the current limitations of mapping general-purpose software to hardware using HLS, suggest some future directions for HLS tool development, and offer several guidelines for hardware-friendly software design. For popular embedded benchmark kernels, the designs produced by HLS achieve 4× to 126× speedup over the software version. The stereo matching algorithms achieve between 3.5× and 67.9× speedup over software (still less than manual RTL design), with a fivefold reduction in design effort versus manual RTL design.