Division Algorithms and Implementations

Authors: Stuart F. Oberman; Michael J. Flynn
Affiliations: Stanford Univ., Stanford, CA; Stanford Univ., Stanford, CA
Venue: IEEE Transactions on Computers
Year: 1997

Abstract

Many algorithms have been developed for implementing division in hardware. These algorithms differ in many aspects, including quotient convergence rate, fundamental hardware primitives, and mathematical formulations. This paper presents a taxonomy of division algorithms that classifies them by their hardware implementations and their impact on system design. Division algorithms fall into five classes: digit recurrence, functional iteration, very high radix, table look-up, and variable latency. Many practical division algorithms are hybrids of several of these classes. These algorithms are explained and compared in this work. It is found that, for low-cost implementations where chip area must be minimized, digit recurrence algorithms are suitable. An implementation of division by functional iteration can provide the lowest latency for typical multiplier latencies. Variable latency algorithms show promise for minimizing average latency and area simultaneously.
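
To make the contrast between two of these classes concrete, the following is a minimal software sketch, not the paper's hardware designs. It assumes radix-2 restoring division as a representative digit recurrence scheme (one quotient bit per step) and Newton-Raphson reciprocal refinement as a representative functional iteration (the error is squared on each step). The function names, the use of exact `Fraction` arithmetic, and the linear initial estimate `48/17 - 32/17 * b` are illustrative assumptions; hardware implementations typically seed the iteration from a lookup table instead.

```python
from fractions import Fraction
from typing import Tuple


def restoring_divide(dividend: int, divisor: int, bits: int) -> Tuple[int, int]:
    """Radix-2 restoring division (digit recurrence): one quotient bit per step.

    Hypothetical helper for illustration; returns (quotient, remainder).
    """
    if divisor <= 0 or dividend < 0 or dividend >= (1 << bits):
        raise ValueError("expects 0 <= dividend < 2**bits and divisor > 0")
    remainder = 0
    quotient = 0
    for i in range(bits - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        # Subtract the divisor only if it fits; otherwise the bit stays 0.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


def newton_raphson_divide(a: Fraction, b: Fraction, iterations: int) -> Fraction:
    """Approximate a / b by refining x ~ 1/b with x <- x * (2 - b * x).

    Assumes the divisor is pre-normalized to [1/2, 1), as a floating-point
    significand would be after scaling.
    """
    if not (Fraction(1, 2) <= b < 1):
        raise ValueError("divisor must be normalized to [1/2, 1)")
    # Assumed linear seed for 1/b; relative error is at most 1/17 on [1/2, 1).
    x = Fraction(48, 17) - Fraction(32, 17) * b
    for _ in range(iterations):
        # Each step squares the relative error e = 1 - b * x.
        x = x * (2 - b * x)
    return a * x


if __name__ == "__main__":
    print(restoring_divide(100, 7, bits=8))
    approx = newton_raphson_divide(Fraction(1), Fraction(3, 4), iterations=4)
    print(float(approx))
```

The digit recurrence loop needs as many steps as there are quotient bits but uses only shifts and subtractions, which is consistent with the abstract's point about low-area implementations. The functional iteration needs only a handful of steps but each one costs two multiplications, so its latency depends on the multiplier.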