Hardware Designs for Exactly Rounded Elementary Functions

Authors:
M. J. Schulte;E. E. Swartzlander, Jr.
Affiliations:
-;-
Venue:
IEEE Transactions on Computers
Year:
1994

Abstract

This paper presents hardware designs that produce exactly rounded results for the reciprocal, square root, 2^x, and log_2(x) functions. These designs use polynomial approximation in which the terms in the approximation are generated in parallel and then summed using a multi-operand adder. To reduce the number of terms in the approximation, the input interval is partitioned into subintervals of equal size, and different coefficients are used for each subinterval. The coefficients used in the approximation are initially determined based on the Chebyshev series approximation. They are then adjusted to obtain exactly rounded results for all inputs. Hardware designs are presented, and delay and area comparisons are made based on the degree of the approximating polynomial and the accuracy of the final result. For single-precision floating-point numbers, a design that produces exactly rounded results for all four functions has an estimated delay of 80 ns and a total chip area of 98 mm^2 in a 1.0-micron CMOS technology. Allowing the results to have a maximum error of one unit in the last place reduces the computational delay by 5% to 30% and the area requirements by 33% to 77%.
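The piecewise approach described in the abstract can be illustrated with a small software sketch. It is not the paper's hardware design: it only shows equal-size subintervals with per-subinterval Chebyshev coefficients, here for the reciprocal on [1, 2). The function names, the choice of 64 subintervals, and the degree-2 polynomial are assumptions made for illustration, and the coefficient-adjustment step that yields exactly rounded results is not reproduced.

```python
import math

def chebyshev_coeffs(f, a, b, degree):
    """Chebyshev interpolation coefficients of f on [a, b] (illustrative helper)."""
    n = degree + 1
    # Angles of the n Chebyshev nodes; cos(theta) lies in [-1, 1].
    thetas = [math.pi * (k + 0.5) / n for k in range(n)]
    # Sample f at the nodes mapped from [-1, 1] onto [a, b].
    samples = [f(0.5 * (b - a) * math.cos(th) + 0.5 * (b + a)) for th in thetas]
    coeffs = [
        (2.0 / n) * sum(s * math.cos(j * th) for s, th in zip(samples, thetas))
        for j in range(n)
    ]
    coeffs[0] *= 0.5  # the constant term carries a factor of 1/2
    return coeffs

def chebyshev_eval(coeffs, a, b, x):
    """Evaluate the Chebyshev sum at x using the Clenshaw recurrence."""
    t = (2.0 * x - a - b) / (b - a)  # map x from [a, b] to [-1, 1]
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + c, b1
    return t * b1 - b2 + coeffs[0]

def build_table(f, lo, hi, index_bits, degree):
    """One coefficient set per equal-size subinterval of [lo, hi)."""
    m = 1 << index_bits
    width = (hi - lo) / m
    return [
        chebyshev_coeffs(f, lo + i * width, lo + (i + 1) * width, degree)
        for i in range(m)
    ]

def approximate(table, lo, hi, x):
    """Select the subinterval containing x, then evaluate its polynomial."""
    m = len(table)
    width = (hi - lo) / m
    i = min(int((x - lo) / width), m - 1)
    a = lo + i * width
    return chebyshev_eval(table[i], a, a + width, x)

if __name__ == "__main__":
    # Assumed parameters: 2**6 subintervals, degree-2 polynomials.
    table = build_table(lambda x: 1.0 / x, 1.0, 2.0, 6, 2)
    worst = max(
        abs(approximate(table, 1.0, 2.0, 1.0 + k / 4096.0) - 1.0 / (1.0 + k / 4096.0))
        for k in range(4096)
    )
    print("max abs error on sampled inputs:", worst)
```

Increasing `index_bits` shrinks each subinterval, so a lower-degree polynomial reaches the same error bound. This mirrors the trade-off between table size and the number of terms that the abstract mentions.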