Area and performance tradeoffs in floating-point divide and square-root implementations
ACM Computing Surveys (CSUR)
IEEE Transactions on Computers
A Library of Parameterized Floating-Point Modules and Their Use
FPL '02 Proceedings of the Reconfigurable Computing Is Going Mainstream, 12th International Conference on Field-Programmable Logic and Applications
A CAD Suite for High-Performance FPGA Design
FCCM '99 Proceedings of the Seventh Annual IEEE Symposium on Field-Programmable Custom Computing Machines
On the Design of Fast IEEE Floating-Point Adders
ARITH '01 Proceedings of the 15th IEEE Symposium on Computer Arithmetic
Tradeoffs of Designing Floating-Point Division and Square Root on Virtex FPGAs
FCCM '03 Proceedings of the 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Floating Point Unit Generation and Evaluation for FPGAs
FCCM '03 Proceedings of the 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
FPGAs vs. CPUs: trends in peak floating-point performance
FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
Sparse Matrix-Vector multiplication on FPGAs
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
Floating-point sparse matrix-vector multiply for FPGAs
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
An Analysis of the Double-Precision Floating-Point FFT on FPGAs
FCCM '05 Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
High performance reconfigurable architecture for double precision floating point division
ARC'12 Proceedings of the 8th international conference on Reconfigurable Computing: architectures, tools and applications
Hi-index | 0.00 |
Growth in floating-point applications for field-programmable gate arrays (FPGAs) has made it critical to optimize floating-point units for FPGA technology. The divider is of particular interest because the design space is large and divider usage in applications varies widely. Obtaining the right balance between clock speed, latency, throughput, and area in FPGAs can be challenging. The designs presented here cover a range of performance, throughput, and area constraints. On a Xilinx Virtex4-11 FPGA, the range includes 250-MHz IEEE compliant double precision divides that are fully pipelined to 187-MHz iterative cores. Similarly, area requirements range from 4100 slices down to a mere 334 slices.