This paper presents hardware designs that produce exactly rounded results for the functions of reciprocal, square-root, 2^x, and log_2(x). These designs use polynomial approximation in which the terms in the approximation are generated in parallel, and then summed by using a multi-operand adder. To reduce the number of terms in the approximation, the input interval is partitioned into subintervals of equal size, and different coefficients are used for each subinterval. The coefficients used in the approximation are initially determined based on the Chebyshev series approximation. They are then adjusted to obtain exactly rounded results for all inputs. Hardware designs are presented, and delay and area comparisons are made based on the degree of the approximating polynomial and the accuracy of the final result. For single-precision floating point numbers, a design that produces exactly rounded results for all four functions has an estimated delay of 80 ns and a total chip area of 98 mm^2 in a 1.0-micron CMOS technology. Allowing the results to have a maximum error of one unit in the last place reduces the computational delay by 5% to 30% and the area requirements by 33% to 77%.
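The first two steps described above (equal-size subintervals, with Chebyshev-based starting coefficients for each one) can be illustrated with a small software sketch. This is not the paper's hardware design: it assumes double-precision arithmetic rather than fixed-point datapaths, it obtains the coefficients by sampling at Chebyshev nodes, and it omits the coefficient adjustment that yields exactly rounded results. The function names (`chebyshev_coeffs`, `build_table`, `evaluate`, `max_error`) and the choice of 16 subintervals with a degree-2 polynomial are illustrative assumptions.

```python
import math


def chebyshev_coeffs(f, a, b, degree):
    """Chebyshev coefficients of f on [a, b], from samples at Chebyshev nodes."""
    n = degree + 1
    thetas = [math.pi * (k + 0.5) / n for k in range(n)]
    # Map the nodes cos(theta) from [-1, 1] onto [a, b] and sample f there.
    samples = [f(0.5 * (b - a) * math.cos(t) + 0.5 * (b + a)) for t in thetas]
    coeffs = []
    for j in range(n):
        s = sum(samples[k] * math.cos(j * thetas[k]) for k in range(n))
        coeffs.append(2.0 * s / n)
    coeffs[0] *= 0.5  # the T_0 term carries half weight
    return coeffs


def clenshaw(coeffs, t):
    """Evaluate sum_j coeffs[j] * T_j(t) for t in [-1, 1]."""
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + c, b1
    return t * b1 - b2 + coeffs[0]


def build_table(f, lo, hi, num_segments, degree):
    """One coefficient set per equal-size subinterval of [lo, hi)."""
    width = (hi - lo) / num_segments
    return [
        chebyshev_coeffs(f, lo + i * width, lo + (i + 1) * width, degree)
        for i in range(num_segments)
    ]


def evaluate(table, lo, hi, x):
    """Pick the subinterval containing x, then evaluate its polynomial."""
    width = (hi - lo) / len(table)
    idx = min(int((x - lo) / width), len(table) - 1)
    a = lo + idx * width
    t = 2.0 * (x - a) / width - 1.0  # rescale x to [-1, 1] within the segment
    return clenshaw(table[idx], t)


def max_error(f, table, lo, hi, num_points=4096):
    """Largest absolute error over a uniform grid of test inputs."""
    step = (hi - lo) / num_points
    return max(
        abs(evaluate(table, lo, hi, lo + i * step) - f(lo + i * step))
        for i in range(num_points)
    )


if __name__ == "__main__":
    def reciprocal(x):
        return 1.0 / x

    # Reciprocal of a significand in [1, 2): 16 subintervals, degree 2.
    table = build_table(reciprocal, 1.0, 2.0, 16, 2)
    print("max abs error:", max_error(reciprocal, table, 1.0, 2.0))
```

In a hardware setting the subinterval index would typically come from the leading bits of the input and the coefficient sets from a lookup table; here `int((x - lo) / width)` plays that role. Raising `num_segments` or `degree` lowers the error, which mirrors the trade-off between table size and polynomial degree that the abstract compares in terms of delay and area.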