Most commercial and academic floating point libraries for FPGAs provide only a small fraction of all possible floating point units. In contrast, the floating point unit generation approach outlined in this paper allows for the creation of a vast collection of floating point units with differing throughput, latency, and area characteristics. Given performance requirements, our generation tool automatically chooses the proper implementation algorithm and architecture to create a compliant floating point unit. Our approach is fully integrated into standard C++ using ASC, a stream compiler for FPGAs, and the PAM-Blox II module generation environment. The floating point units created by our approach exhibit a factor of two latency improvement versus commercial FPGA floating point units, while consuming only half of the FPGA logic area.
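
The requirement-driven selection described above can be illustrated with a minimal C++ sketch. It is not the paper's tool and does not use the ASC or PAM-Blox II APIs. The names `FpuCandidate`, `Requirements`, and `choose_unit` are hypothetical, and the rule of picking the smallest-area unit that meets the latency and throughput bounds is an assumption made purely for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one candidate floating point unit.
// Any values placed in these fields are placeholders, not measurements.
struct FpuCandidate {
    std::string name;        // illustrative label for an algorithm/architecture pair
    int latency_cycles;      // pipeline depth in clock cycles
    double throughput_mops;  // sustained results, in millions per second
    int area_slices;         // estimated FPGA logic area
};

// Hypothetical performance requirements supplied by the user.
struct Requirements {
    int max_latency_cycles;
    double min_throughput_mops;
};

// Assumed selection rule: among the candidates that satisfy both bounds,
// return the one with the smallest area. Returns nullptr if none qualifies.
// The returned pointer refers to an element of `candidates`.
const FpuCandidate* choose_unit(const std::vector<FpuCandidate>& candidates,
                                const Requirements& req) {
    const FpuCandidate* best = nullptr;
    for (const FpuCandidate& c : candidates) {
        const bool meets = c.latency_cycles <= req.max_latency_cycles &&
                           c.throughput_mops >= req.min_throughput_mops;
        if (meets && (best == nullptr || c.area_slices < best->area_slices)) {
            best = &c;
        }
    }
    return best;
}
```

A caller would fill a vector of candidates with its own characterization data and pass the bounds it needs. A null result means no listed unit is compliant, at which point a real generator would presumably explore further implementations.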