Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
The large number of DSP slices available on new-generation FPGAs allows for efficient mapping and acceleration of floating-point-intensive codes. Numerous scientific codes rely heavily on the exponential function. To this end, we present the design and implementation of a pipelined CORDIC/TD-based (COordinate Rotation DIgital Computer/Table-Driven) Exponential Approximation Unit (EAU) that will be made freely available for download (including the hardware description). The EAU supports single- and double-precision arithmetic, and we provide appropriate configurations for Virtex 2, Virtex 4, and Virtex 5 FPGAs. The architecture has been verified via simulations and by testing on a real FPGA. The implementation achieves the highest clock frequency reported in the literature to date. Moreover, the EAU occupies only 5% of the hardware resources on a medium-size FPGA such as the Virtex 5 SX95T. In addition, we present a general framework for safely conducting application-specific optimizations of floating-point operators on FPGAs. We apply this framework to a bioinformatics application and optimize the EAU architecture using width-reduced floating-point operators and application-specific performance tuning. The optimized application-specific EAU occupies approximately 70% fewer hardware resources than the initial single-precision implementation.
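To make the table-driven half of the approach concrete, the following is a minimal software sketch of a generic table-driven exponential. It is not the EAU architecture described above, and it omits the CORDIC stage entirely. The table size, the degree-3 polynomial, and the names `TABLE_BITS`, `TABLE`, `STEP`, and `exp_table_driven` are all assumptions chosen for illustration. The idea is to write x = (m·N + j)·ln2/N + r with a small remainder r, so that exp(x) = 2^m · 2^(j/N) · exp(r), where 2^(j/N) comes from a lookup table and exp(r) from a short polynomial.

```python
import math

# Hypothetical parameters for illustration; a hardware unit would size the
# table to match its block RAM and target precision.
TABLE_BITS = 5
N = 1 << TABLE_BITS                          # number of table entries
TABLE = [2.0 ** (j / N) for j in range(N)]   # precomputed 2^(j/N)
STEP = math.log(2.0) / N                     # spacing of the reduction grid


def exp_table_driven(x: float) -> float:
    """Approximate exp(x) via range reduction, table lookup, and a polynomial.

    No overflow, underflow, NaN, or infinity handling is attempted here.
    """
    k = round(x / STEP)       # nearest grid point: x ~= k * ln2 / N
    r = x - k * STEP          # remainder, |r| <= ln2 / (2N)
    m, j = divmod(k, N)       # k = m*N + j with 0 <= j < N (also for k < 0)
    # Degree-3 Taylor polynomial for exp(r), evaluated in Horner form.
    p = 1.0 + r * (1.0 + r * (0.5 + r / 6.0))
    # Scale by 2^m; in hardware this is an exponent-field adjustment.
    return math.ldexp(TABLE[j] * p, m)
```

With 32 table entries the remainder satisfies |r| ≤ ln2/64, so the truncated Taylor series contributes a relative error on the order of r⁴/24, well below single-precision resolution. A larger table or a higher-degree polynomial would be needed to approach double precision, which is the kind of area-versus-accuracy trade-off that width-reduced operators exploit.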