In this work we present an implementation of the exponential function in double precision, in a unit that supports IEEE floating-point arithmetic. Like existing proposals, the implementation is based on a floating-point multiplier and additional hardware. We decompose the computation into three subexponentials. The first and third subexponentials are computed in a conventional way (table look-up and polynomial approximation). The second subexponential is computed by transforming the slow radix-2 digit-recurrence algorithm into a fast computation that uses the multiplier and additional hardware. We present a design process that permits the selection of the most convenient trade-off between hardware complexity and latency. We discuss the algorithm and the implementation, and make a rough comparison with three previously proposed designs. Our estimates indicate that the implementation proposed in this work offers a better trade-off between hardware complexity and latency than the compared designs.
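The three-way decomposition can be illustrated with a small software sketch. This is not the paper's hardware design: the field widths `T1_BITS` and `T2_LAST`, the range reduction by ln 2, and the function name `exp_three_parts` are all assumptions chosen for illustration, and the second subexponential is computed here with the plain (slow) radix-2 recurrence rather than the multiplier-based transformation the abstract describes.

```python
import math

# Hypothetical parameters, chosen only for this illustration.
T1_BITS = 6    # width of the leading field handled by table look-up
T2_LAST = 20   # last index j of the radix-2 recurrence steps

# Table for the first subexponential: exp(i * 2^-T1_BITS).
TABLE1 = [math.exp(i / (1 << T1_BITS)) for i in range(1 << T1_BITS)]
# Constants ln(1 + 2^-j) consumed by the digit recurrence.
LN_TABLE = {j: math.log1p(2.0 ** -j) for j in range(T1_BITS, T2_LAST + 1)}


def exp_three_parts(x):
    """Approximate exp(x) as 2^k * e1 * e2 * e3 (illustrative sketch)."""
    # Assumed range reduction: x = k*ln2 + r with r roughly in [0, ln2).
    ln2 = math.log(2.0)
    k = math.floor(x / ln2)
    r = x - k * ln2

    # First subexponential: table look-up on the leading bits of r.
    i = int(r * (1 << T1_BITS))
    e1 = TABLE1[i]
    rem = r - i / (1 << T1_BITS)

    # Second subexponential: radix-2 digit recurrence with digits {0, 1}.
    # Each step either leaves rem alone or subtracts ln(1 + 2^-j) and
    # multiplies the running product by (1 + 2^-j).
    e2 = 1.0
    for j in range(T1_BITS, T2_LAST + 1):
        if rem >= LN_TABLE[j]:
            rem -= LN_TABLE[j]
            e2 *= 1.0 + 2.0 ** -j

    # Third subexponential: short polynomial on the small residual.
    e3 = 1.0 + rem + 0.5 * rem * rem

    return math.ldexp(e1 * e2 * e3, k)
```

Because exp(r) = e1 · e2 · exp(rem) holds at every step, the only approximation in this sketch is the truncated polynomial for the final residual, plus ordinary floating-point rounding. Moving `T1_BITS` and `T2_LAST` shifts work between the table, the recurrence, and the polynomial, which is loosely the kind of trade-off the design process in the abstract is meant to explore.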