Reducing the cost of floating-point mantissa alignment and normalization in FPGAs

Authors:
Yehdhih Ould Mohammed Moctar; Nithin George; Hadi Parandeh-Afshar; Paolo Ienne; Guy G.F. Lemieux; Philip Brisk
Affiliations:
University of California, Riverside, Riverside, CA, USA; Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland; Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland; Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland; University of British Columbia, Vancouver, BC, Canada; University of California, Riverside, Riverside, CA, USA
Venue:
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
Year:
2012

Abstract

In floating-point datapaths synthesized on FPGAs, the shifters that perform mantissa alignment and normalization consume a disproportionate number of LUTs. Shifters are implemented using several rows of small multiplexers; unfortunately, multiplexer-based logic structures map poorly onto LUTs. FPGAs, meanwhile, contain a large number of multiplexers in the programmable routing network; these multiplexers are placed under static control of the FPGA's configuration bitstream. In this work, we modify some of the routing multiplexers in the intra-cluster routing network of a CLB in an FPGA to implement shifters for floating-point mantissa alignment and normalization; the number of CLBs required for these operations is reduced by 67%. If shifting is not required, the modified routing multiplexers can be configured to operate as normal routing multiplexers, so no functionality is sacrificed. The area overhead incurred by these modifications is small, and there is no need to modify every routing multiplexer in the FPGA. Experiments show no negative impact on clock frequency or routability for benchmarks that do not use the dynamic multiplexers.
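
The abstract's remark that shifters are built from "several rows of small multiplexers" can be illustrated with a small software model. The sketch below is not the authors' circuit; it assumes a conventional logarithmic barrel shifter built from 2:1 multiplexers, and the names `mux2`, `barrel_shift_right`, `to_bits`, and `from_bits` are hypothetical helpers introduced only for this illustration.

```python
def mux2(sel, a, b):
    """Model of a 2:1 multiplexer: output b when sel is 1, otherwise a."""
    return b if sel else a


def to_bits(value, width):
    """Unpack an integer into a list of bits, index 0 = least significant."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Pack a list of bits (index 0 = least significant) into an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def barrel_shift_right(bits, shift):
    """Logical right shift built from rows of 2:1 multiplexers.

    Row r either passes its input through or shifts it by 2**r positions,
    depending on bit r of the shift amount. Shift-amount bits above the
    number of rows are ignored in this simplified model.
    Returns the shifted bits and the number of multiplexers used.
    """
    width = len(bits)
    rows = max(1, (width - 1).bit_length())  # ceil(log2(width)) rows
    mux_count = 0
    current = list(bits)
    for row in range(rows):
        step = 1 << row
        sel = (shift >> row) & 1
        nxt = []
        for i in range(width):
            # Bits shifted in from beyond the top of the word are zero.
            shifted_in = current[i + step] if i + step < width else 0
            nxt.append(mux2(sel, current[i], shifted_in))
            mux_count += 1
        current = nxt
    return current, mux_count


if __name__ == "__main__":
    mantissa = 0xABCDEF  # an arbitrary 24-bit example value
    shifted, muxes = barrel_shift_right(to_bits(mantissa, 24), 5)
    print(hex(from_bits(shifted)), muxes)
```

In this model, a 24-bit word needs 5 rows of 24 multiplexers each, so every shifter instance contains on the order of a hundred 2:1 multiplexers. That count comes from the model's own assumptions rather than from the paper, but it suggests why dynamically controlled multiplexers are an attractive target for reuse of routing resources.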