High-level synthesis: productivity, performance, and software constraints

  • Authors:
  • Yun Liang;Kyle Rupnow;Yinan Li;Dongbo Min;Minh N. Do;Deming Chen

  • Affiliations:
  • Advanced Digital Sciences Center, Singapore;Advanced Digital Sciences Center, Singapore;Advanced Digital Sciences Center, Singapore;Advanced Digital Sciences Center, Singapore;Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL;Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL

  • Venue:
  • Journal of Electrical and Computer Engineering - Special issue on ESL Design Methodology
  • Year:
  • 2012

Abstract

FPGAs are an attractive platform for applications with high computation demand and low energy consumption requirements. However, design effort for FPGA implementations remains high, often an order of magnitude larger than design effort using high-level languages. Instead of this time-consuming process, high-level synthesis (HLS) tools generate hardware implementations from algorithm descriptions in languages such as C/C++ and SystemC. Such tools reduce design effort: high-level descriptions are more compact and less error prone. HLS tools promise hardware development that does not depend on the software designer's knowledge of the implementation platform. In this paper, we present an unbiased study of the performance, usability, and productivity of HLS using AutoPilot (a state-of-the-art HLS tool). We first evaluate AutoPilot using popular embedded benchmark kernels. Then, to evaluate the suitability of HLS for real-world applications, we perform a case study of stereo matching, an active area of computer vision research that uses techniques also common in image denoising, image retrieval, feature matching, and face recognition. Based on our study, we provide insights into the current limitations of mapping general-purpose software to hardware using HLS and suggest some future directions for HLS tool development. We also offer several guidelines for hardware-friendly software design. For popular embedded benchmark kernels, the designs produced by HLS achieve 4× to 126× speedup over the software version. The stereo matching algorithms achieve between 3.5× and 67.9× speedup over software (but still less than manual RTL design) with a fivefold reduction in design effort versus manual RTL design.