Generation of a Precise Binary Logarithm with Difference Grouping Programmable Logic Array
IEEE Transactions on Computers
A Systematic Method for Division with High Average Bit Skipping
IEEE Transactions on Computers
Table-driven implementation of the logarithm function in IEEE floating-point arithmetic
ACM Transactions on Mathematical Software (TOMS)
Introduction to parallel algorithms and architectures: array, trees, hypercubes
Introduction to parallel algorithms and architectures: array, trees, hypercubes
Cost-efficient high-radix division
Journal of VLSI Signal Processing Systems - Special issue: computer arithmetic
Fast Division Using Accurate Quotient Approximations to Reduce the Number of Iterations
IEEE Transactions on Computers - Special issue on computer arithmetic
High-radix algorithms for high-order arithmetic operations
High-radix algorithms for high-order arithmetic operations
The art of computer programming, volume 2 (3rd ed.): seminumerical algorithms
The art of computer programming, volume 2 (3rd ed.): seminumerical algorithms
Computer Arithmetic: Principles, Architecture and Design
Computer Arithmetic: Principles, Architecture and Design
Error Coding for Arithmetic Processors
Error Coding for Arithmetic Processors
Combinatorial Algorithms: For Computers and Hard Calculators
Combinatorial Algorithms: For Computers and Hard Calculators
Some Results on a SRT Type Division Scheme
IEEE Transactions on Computers
Approximations for Digital Computers
Approximations for Digital Computers
Hardware Starting Approximation Method and Its Application to the Square Root Operation
IEEE Transactions on Computers
Technology Trends and Adaptive Computing
FPL '01 Proceedings of the 11th International Conference on Field-Programmable Logic and Applications
FPGA-based System for Real-Time Video Texture Analysis
Journal of Signal Processing Systems
Multi-Gb/s LDPC code design and implementation
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A fundamental way of implementing certain algorithms in parallel is by means of trees and arrays [1]. This paper describes a method of generating any function defined by a power series in a fast, efficient, parallel manner using trees and arrays. The power series considered can be written as $f(Y) = a_0 + a_1 Y + a_2 Y^2 + \cdots$, where $Y = v_1 x + v_2 x^2 + \cdots + v_k x^k$, with $v_i \in \{0, 1\}$, is a binary fraction when $x = 1/2$. The power series must be expanded into individual terms of the form $c\,x^i$, which are then transformed into weighted binary terms. Two methods are given for obtaining all the individual terms (including coefficients) associated with each power of $x$. The hardware required for implementation is a tree similar to the Wallace or Dadda tree used for parallel multiplication of two binary numbers. Despite the multiplicity of terms required, Boolean logic methods reduce the tree dimensions in many cases, so that the total tree required is smaller than an existing multiplier tree. In that case, Schwarz and Flynn [13], [15] have shown that the required tree can be superimposed on the existing multiplier tree in a multiplexed manner with relatively little increase in hardware. The generation of the logarithmic function is described in detail, and comparisons with other methods are made for the case of 11-bit accuracy of the logarithm. Using a figure of merit of latency times area (number of transistors), estimates show that the superposition scheme gives the best (smallest) figure of merit. For 11-bit accuracy, the superposition scheme requires only about 480 additional gates to be superimposed upon a 41-bit or larger multiplier, and the speed of operation is that of the multiplier.
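The expansion step described in the abstract can be illustrated with a small software sketch. It is not the paper's procedure, only a minimal model under stated assumptions: the function names `expand_terms` and `evaluate` are hypothetical, the coefficients are supplied by the caller, and the only Boolean simplification applied is idempotence ($v_i \cdot v_i = v_i$), which lets each product of bits be identified by the set of bit indices it contains. Each resulting entry corresponds to one AND-of-bits term with a rational weight, loosely analogous to the weighted terms that would feed a summation tree.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def expand_terms(coeffs, k):
    """Expand f(Y) = sum_n coeffs[n] * Y**n, with Y = sum_{i=1..k} v_i * 2**-i,
    into a map {frozenset of bit indices: rational weight}.

    Each v_i is 0 or 1, so v_i * v_i = v_i; a product of bits is therefore
    identified by the set of indices that appear in it.
    """
    # Y itself: one single-bit term per index i, with weight 2**-i.
    y = {frozenset([i]): Fraction(1, 2**i) for i in range(1, k + 1)}
    power = {frozenset(): Fraction(1)}  # terms of Y**0
    total = defaultdict(Fraction)
    for a in coeffs:
        # Accumulate a_n * Y**n into the running total.
        for bits, w in power.items():
            total[bits] += a * w
        # Multiply the current power by Y, merging repeated bits via set union.
        nxt = defaultdict(Fraction)
        for (b1, w1), (b2, w2) in product(power.items(), y.items()):
            nxt[b1 | b2] += w1 * w2
        power = nxt
    return {bits: w for bits, w in total.items() if w != 0}


def evaluate(terms, v):
    """Sum the weights of all terms whose bits are 1 in v (v[i-1] holds v_i)."""
    return sum(
        (w for bits, w in terms.items() if all(v[i - 1] for i in bits)),
        Fraction(0),
    )
```

As a usage example, calling `expand_terms([0, 1, Fraction(-1, 2), Fraction(1, 3)], 4)` expands the first terms of the series for ln(1 + Y) over four input bits; these coefficients are an illustrative choice, not values taken from the paper. Evaluating the result with `evaluate(terms, v)` for any bit tuple `v` reproduces the truncated polynomial exactly, since all arithmetic is done with `Fraction`.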