Pipelining of double precision floating point division and square root operations

Authors:
Anuja Jayraj Thakkar; Abdel Ejnioui
Affiliations:
University of Central Florida, Orlando, Florida; University of South Florida, Lakeland, Florida
Venue:
Proceedings of the 44th annual Southeast regional conference
Year:
2006

Abstract

Space applications rely increasingly on high data rate DSP algorithms. These algorithms use double precision floating point arithmetic operations. While most DSP applications can be compiled for DSP processors, high data rate DSP computations require novel implementation technologies to support their high throughputs. Only recently have gate densities in FPGAs reached a level that makes them attractive platforms for implementing compute-intensive DSP applications. In this context, this paper presents sequential and pipelined designs of a double precision floating point divider and square root unit on FPGAs. In contrast to pipelined parallel implementations, the pipelining of these units is based on unrolling the iterations of low-radix digit recurrence algorithms. These units are mapped onto generic FPGA reconfigurable fabric without taking advantage of any advanced architectural components available in high capacity FPGAs. The implementations of these designs show that their performance is comparable to, and sometimes higher than, the performance of non-iterative designs based on high radix numbers. The iterative divider and square root unit occupy less than 1% of an XC2V6000 FPGA chip, while their pipelined counterparts can produce throughputs that reach the 100 MFLOPS mark while consuming a modest 8% of the chip area.
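
To illustrate what "unrolling the iterations in low-radix digit recurrence algorithms" can look like, here is a minimal software sketch. It assumes radix-2 restoring recurrences on integer significands; the abstract does not say which low-radix variant the authors use, so the algorithm choice and the names `restoring_divide` and `restoring_sqrt` are hypothetical. Each loop iteration produces one result bit. A sequential design would reuse one such step over many cycles, whereas an unrolled design would lay the steps out as successive pipeline stages.

```python
def restoring_divide(x: int, d: int, frac_bits: int) -> tuple[int, int]:
    """Radix-2 restoring division of integer significands (illustrative).

    Assumes d > 0 and 0 <= x < 2*d, which holds for two normalized
    significands. Returns (q, r) with q = floor(x * 2**frac_bits / d)
    and r = x * 2**frac_bits - q * d.
    """
    if d <= 0 or not (0 <= x < 2 * d):
        raise ValueError("expected d > 0 and 0 <= x < 2*d")
    # Integer bit of the quotient: a single compare-and-subtract.
    q, r = (1, x - d) if x >= d else (0, x)
    for _ in range(frac_bits):  # one iteration = one unrolled stage
        r <<= 1                 # shift the partial remainder left
        if r >= d:              # trial subtraction succeeds: bit is 1
            r -= d
            q = (q << 1) | 1
        else:                   # "restore": keep remainder, bit is 0
            q <<= 1
    return q, r


def restoring_sqrt(x: int, n_bits: int) -> tuple[int, int]:
    """Radix-2 restoring square root of an integer radicand (illustrative).

    Assumes 0 <= x < 2**(2*n_bits). Returns (root, rem) with
    root = floor(sqrt(x)) and rem = x - root**2.
    """
    if x < 0 or x >> (2 * n_bits):
        raise ValueError("expected 0 <= x < 2**(2*n_bits)")
    root, rem = 0, 0
    for j in range(n_bits - 1, -1, -1):  # one iteration = one stage
        # Bring down the next two radicand bits.
        rem = (rem << 2) | ((x >> (2 * j)) & 0b11)
        trial = (root << 2) | 1          # candidate subtrahend 4*root + 1
        if rem >= trial:                 # root bit is 1
            rem -= trial
            root = (root << 1) | 1
        else:                            # root bit is 0
            root <<= 1
    return root, rem


if __name__ == "__main__":
    # Double precision significands carry 53 bits (52 stored + 1 implicit).
    a = (1 << 52) | 0x123456789
    b = (1 << 52) | 0xABCDEF
    q, r = restoring_divide(a, b, 55)
    print(hex(q), hex(r))
    print(restoring_sqrt(a << 54, 54))
```

The sketch covers only the significand datapath. Exponent handling, normalization, and rounding, which a full floating point unit needs, are left out.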