Optimizing floating point units in hybrid FPGAs

Authors:
ChiWai Yu;Alastair M. Smith;Wayne Luk;Philip H. W. Leong;Steven J. E. Wilton
Affiliations:
Department of Computing, Imperial College London, London, UK;Department of Electrical and Electronic Engineering, Imperial College London, UK;Department of Computing, Imperial College London, London, UK;School of Electrical and Information Engineering, University of Sydney, Sydney, NSW, Australia;School of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2012

Abstract

This paper introduces a methodology for optimizing coarse-grained floating-point units (FPUs) in a hybrid field-programmable gate array (FPGA), where each FPU consists of a number of interconnected floating-point adders/subtracters (FAs), multipliers (FMs), and wordblocks (WBs). The wordblocks include registers and lookup tables (LUTs), which can implement fixed-point operations efficiently. We employ common subgraph extraction to determine the best mix of blocks within an FPU and study the area, speed, and utilization tradeoff over a set of floating-point benchmark circuits. We then explore the system impact of FPU density and flexibility in terms of area, speed, and routing resources. Finally, we derive an optimized coarse-grained FPU by considering both architectural and system-level issues. The proposed methodology can be used to evaluate a variety of FPU architecture optimizations. The results for the selected FPU architecture optimization show that, although high-density FPUs are slower, they offer improved area, area-delay product, and throughput.
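
As a rough illustration of the idea behind common subgraph extraction, the sketch below counts which two-operator connection patterns (for example, a multiplier feeding an adder) appear in every benchmark dataflow graph. It is a deliberately simplified stand-in, not the paper's algorithm: the function names, the graph encoding, and the two toy benchmarks are all hypothetical, and real subgraph extraction would consider patterns larger than a single edge.

```python
from collections import Counter

def edge_patterns(ops, edges):
    """Count (producer type, consumer type) pairs in one dataflow graph.

    ops maps a node name to its block type ("FA" or "FM");
    edges is a list of (source node, destination node) pairs.
    """
    return Counter((ops[src], ops[dst]) for src, dst in edges)

def common_patterns(benchmarks):
    """Return the patterns present in every benchmark.

    Each pattern maps to its minimum count across benchmarks, i.e. how many
    instances a shared FPU could serve in all of them.
    """
    counts = [edge_patterns(ops, edges) for ops, edges in benchmarks]
    if not counts:
        return {}
    shared = set(counts[0])
    for c in counts[1:]:
        shared &= set(c)
    return {p: min(c[p] for c in counts) for p in shared}

# Hypothetical toy benchmarks (not taken from the paper).
# dot2: two multipliers feeding one adder.
dot2 = ({"m0": "FM", "m1": "FM", "a0": "FA"},
        [("m0", "a0"), ("m1", "a0")])
# axpy: one multiplier feeding an adder, which feeds a second adder.
axpy = ({"m0": "FM", "a0": "FA", "a1": "FA"},
        [("m0", "a0"), ("a0", "a1")])

result = common_patterns([dot2, axpy])
print(result)
```

In this toy case the only pattern shared by both graphs is a multiplier feeding an adder, which hints at why an FPU that hard-wires an FM-to-FA connection could be well utilized across such circuits.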