Floating-point reduction operations are a vital part of scientific computational kernels such as vector dot products, discrete cosine transforms (DCT), and matrix-matrix multiplications. As FPGAs continue to gain popularity in custom and embedded computing platforms, efficient implementations of these kernels on FPGAs are desirable. Because high-performance floating-point cores on FPGAs are inherently deeply pipelined, reduction circuits require special feedback and buffering schemes to achieve full throughput. In this paper, we present our floating-point reduction architecture, which is clocked at more than 150 MHz when targeting a Xilinx Virtex2 8000-4 FPGA.
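
To illustrate why a deep adder pipeline complicates reduction, the following is a minimal software sketch of a common textbook baseline, not the architecture presented in the paper. It assumes an adder whose result appears `latency` cycles after issue and that accepts one new input per cycle. The function names `interleaved_accumulate` and `reduce_stream` and the parameter `latency` are hypothetical, chosen only for this illustration.

```python
from collections import deque


def interleaved_accumulate(values, latency):
    """Stream `values` through a simulated pipelined adder, one per cycle.

    A result issued now only emerges `latency` cycles later, so the value
    fed back each cycle is the sum issued `latency` cycles earlier. The
    stream therefore ends with `latency` independent partial sums.
    """
    if latency < 1:
        raise ValueError("latency must be at least 1")
    pipe = deque([0.0] * latency)   # in-flight results, oldest first
    for x in values:
        fed_back = pipe.popleft()   # result leaving the adder this cycle
        pipe.append(fed_back + x)   # issue the next addition immediately
    return list(pipe)               # partial sums that still need merging


def reduce_stream(values, latency):
    """Reduce `values` to one sum by merging the interleaved partial sums."""
    partials = interleaved_accumulate(values, latency)
    # Pairwise merge. In hardware each level would pass through the same
    # deep adder again, which is where extra buffering and control come in.
    while len(partials) > 1:
        merged = [partials[i] + partials[i + 1]
                  for i in range(0, len(partials) - 1, 2)]
        if len(partials) % 2:
            merged.append(partials[-1])
        partials = merged
    return partials[0]


if __name__ == "__main__":
    data = [float(i) for i in range(1, 11)]
    print(interleaved_accumulate(data, latency=4))
    print(reduce_stream(data, latency=4))
```

In this sketch the streaming phase keeps the adder busy every cycle, but the final merge of the leftover partial sums is the step that stalls a naive design when a new input set arrives back to back. That merge is the kind of problem a dedicated feedback and buffering scheme has to solve.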