Design Space Pruning Through Early Estimations of Area/Delay Tradeoffs for FPGA Implementations

  • Authors:
  • S. Bilavarn;G. Gogniat;J. -L. Philippe;L. Bossuet

  • Affiliations:
  • Signal Process. Inst., Ecole Polytech. Fed. de Lausanne;-;-;-

  • Venue:
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • Year:
  • 2006

Abstract

Early performance feedback and design space exploration of complete field-programmable gate array (FPGA) designs are still time-consuming tasks. This paper proposes an original estimation-based methodology to reduce their impact on design time. It promotes a hierarchical exploration to mitigate the complexity of the exploration process. To that end, the work takes place before any design step, such as compilation or behavioral synthesis, while the specification is still provided as a C program. The goal is to provide early area and delay evaluations of many register-transfer level (RTL) implementations in order to prune the design space. Two main steps compose the flow: 1) a structural exploration step defines several RTL implementations, and 2) a physical mapping estimation step computes the mapping characteristics of these implementations onto a given FPGA device. For the structural exploration, a simple yet realistic RTL model reduces the complexity and permits a fast definition of solutions. At this stage, the exploration focuses on computation parallelism and memory bandwidth. Advanced optimizations using, for instance, loop tiling, scalar replacement, or data layout are not considered. For the physical estimations, an analytical approach is used to provide fast and accurate area/delay tradeoffs. The paper also does not consider the impact of routing on critical paths or other optimizations. This reduction in complexity allows the evaluation of key design alternatives, namely target device and parallelism, which can also include the effect of resource allocation, bitwidth, or clock period. As a result, a designer can quickly identify a reliable subset of solutions to which further refinement can be applied to enhance the relevance of the final architecture and reach a better use of FPGA resources, i.e., an optimal level of performance.
Experiments performed with Xilinx (VirtexE) and Altera (Apex20K) FPGAs for a two-dimensional Discrete Wavelet Transform and a G722 speech coder lead to an average error of 10% for temporal values and 18% for area estimations.
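
To illustrate what pruning a design space on area/delay tradeoffs can look like, the following minimal Python sketch keeps only the candidates that no other candidate beats in both area and delay (a Pareto filter). It is not the paper's estimation method. The `Estimate` type, the `pareto_prune` function, the candidate names, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Estimate(NamedTuple):
    """One candidate RTL solution with its estimated cost (hypothetical units)."""
    name: str
    area: float   # e.g. estimated number of logic cells
    delay: float  # e.g. estimated execution time


def pareto_prune(candidates: List[Estimate]) -> List[Estimate]:
    """Return the candidates that are not dominated in (area, delay).

    A candidate is dominated when another one is no worse in both
    metrics and differs in at least one of them.
    """
    kept = []
    best_delay = float("inf")
    # Scan by increasing area; a candidate survives only if it is
    # strictly faster than every cheaper (or equally cheap) one.
    for c in sorted(candidates, key=lambda c: (c.area, c.delay)):
        if c.delay < best_delay:
            kept.append(c)
            best_delay = c.delay
    return kept


# Made-up estimates for different parallelism choices (assumption).
candidates = [
    Estimate("p1", 100, 80),
    Estimate("p2", 180, 45),
    Estimate("p4", 350, 30),
    Estimate("p2_wide", 220, 50),  # larger and slower than p2
    Estimate("p8", 700, 30),       # larger than p4, no faster
]

front = pareto_prune(candidates)
for c in front:
    print(c.name, c.area, c.delay)
```

With these made-up values, `p2_wide` and `p8` are discarded because a smaller candidate is at least as fast. The remaining subset is the kind of reduced set a designer might then refine further.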