An Analysis of the Double-Precision Floating-Point FFT on FPGAs

Authors:
K. Scott Hemmert; Keith D. Underwood
Affiliations:
Sandia National Laboratories; Sandia National Laboratories
Venue:
FCCM '05 Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Year:
2005

Citing 0
Cited 13

Embedded floating-point units in FPGAs
Proceedings of the 2006 ACM/SIGDA 14th international symposium on Field programmable gate arrays

Preliminary investigation of advanced electrostatics in molecular dynamics on reconfigurable computers
Proceedings of the 2006 ACM/IEEE conference on Supercomputing

Architectures and APIs: assessing requirements for delivering FPGA performance to applications
Proceedings of the 2006 ACM/IEEE conference on Supercomputing

Using FPGA Devices to Accelerate Biomolecular Simulations
Computer

Architectural modifications to enhance the floating-point performance of FPGAs
IEEE Transactions on Very Large Scale Integration (VLSI) Systems

Floating-point divider design for FPGAs
IEEE Transactions on Very Large Scale Integration (VLSI) Systems

Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation

State-of-the-art in heterogeneous computing
Scientific Programming

Fast, Efficient Floating-Point Adders and Multipliers for FPGAs
ACM Transactions on Reconfigurable Technology and Systems (TRETS)

FPGA-Array with Bandwidth-Reduction Mechanism for Scalable and Power-Efficient Numerical Simulations Based on Finite Difference Methods
ACM Transactions on Reconfigurable Technology and Systems (TRETS)

Accelerating Machine-Learning Algorithms on FPGAs using Pattern-Based Decomposition
Journal of Signal Processing Systems

An evaluation of an integrated on-chip/off-chip network for high-performance reconfigurable computing
International Journal of Reconfigurable Computing - Special issue on High-Performance Reconfigurable Computing

A fast poisson solver for hybrid reconfigurable system
ARC'13 Proceedings of the 9th international conference on Reconfigurable Computing: architectures, tools, and applications

Abstract

Advances in FPGA technology have led to dramatic improvements in double-precision floating-point performance. Modern FPGAs boast several GFLOPS of raw computing power. Unfortunately, this computing power is distributed across 30 floating-point units with over 10 cycles of latency each. The user must find two orders of magnitude more parallelism than is typically exploited in a single microprocessor; thus, it is not clear that the computational power of FPGAs can be exploited across a wide range of algorithms. This paper explores three implementation alternatives for the Fast Fourier Transform (FFT) on FPGAs. The algorithms are compared in terms of sustained performance and memory requirements for various FFT sizes and FPGA sizes. The results indicate that FPGAs are competitive with microprocessors in terms of performance and that the "correct" FFT implementation varies based on the size of the transform and the size of the FPGA.
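
The parallelism figure in the abstract can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, the unit count (30) and latency (10 cycles, the abstract's lower bound) come from the abstract, and the FFT is assumed to be a textbook radix-2 decimation, which may or may not match any of the three implementations the paper studies. It estimates how many independent operations must be in flight to keep every pipelined unit busy, and how many independent butterflies one radix-2 stage offers.

```python
def ops_in_flight(num_units: int, latency_cycles: int) -> int:
    """Independent operations needed to keep every pipelined unit full.

    Each unit accepts one operation per cycle and holds `latency_cycles`
    operations at once, so the total is the product of the two.
    """
    return num_units * latency_cycles


def radix2_stage_butterflies(n: int) -> int:
    """Independent butterflies in one stage of a radix-2 FFT of size n."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two >= 2")
    return n // 2


def radix2_flops(n: int) -> int:
    """Real floating-point operations in a radix-2 FFT of size n.

    Assumes 10 real operations per butterfly: one complex multiply
    (4 multiplies + 2 adds) plus two complex adds (4 adds).
    """
    stages = n.bit_length() - 1  # log2(n) for a power of two
    return 10 * radix2_stage_butterflies(n) * stages


if __name__ == "__main__":
    # Figures from the abstract: 30 units, at least 10 cycles of latency.
    print(ops_in_flight(30, 10))
    # Hypothetical 1024-point transform, chosen only for illustration.
    print(radix2_stage_butterflies(1024), radix2_flops(1024))
```

With the abstract's numbers, roughly 300 independent operations are needed at any moment, which is the "two orders of magnitude" the authors refer to. Under the radix-2 assumption, a single stage of a 1024-point transform supplies 512 independent butterflies, which suggests why the FFT is a plausible candidate for this kind of hardware.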