Floating-point division and square root using a Taylor-series expansion algorithm

Authors:
Taek-Jun Kwon;Jeffrey Draper
Affiliations:
University of Southern California/Information Sciences Institute, 4676 Admiralty Way STE 1001, Marina Del Rey, CA 90292, USA;University of Southern California/Information Sciences Institute, 4676 Admiralty Way STE 1001, Marina Del Rey, CA 90292, USA
Venue:
Microelectronics Journal
Year:
2009

Abstract

Hardware support for floating-point (FP) arithmetic is a mandatory feature of modern microprocessor design. Although division and square root are relatively infrequent operations in traditional general-purpose applications, they are indispensable and increasingly important in many modern applications. Overall performance can therefore be greatly affected by the algorithms and implementations used in designing FP-Div and FP-Sqrt units. In this paper, a single-precision fused floating-point multiply/divide/square root unit based on a Taylor-series expansion algorithm is proposed. We extended an existing fused multiply/divide unit to incorporate the square root function with little area and latency overhead, since Taylor's theorem enables us to compute approximations for many well-known functions with very similar forms. Implementation results for the proposed fused unit, based on standard-cell methodology in IBM 90 nm technology, show that incorporating the square root function into an existing multiply/divide unit requires only a modest 18% area increase, and that divide and square root operations achieve the same low latency of 12 cycles. The proposed arithmetic unit exhibits a reasonably good area-performance balance.
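
The abstract does not give the formulas used in the unit, so the following is only a minimal software sketch of the general idea that Taylor series for the reciprocal and the reciprocal square root have very similar forms. It assumes a coarse seed (standing in for a small hardware lookup table) followed by a three-term series; the function names, the 8-bit seed width, and the number of terms are illustrative assumptions, not details taken from the paper.

```python
import math

SEED_BITS = 8  # assumed seed precision; a stand-in for a small lookup table


def taylor_divide(x, y, terms=3):
    """Approximate x / y for y > 0 with a truncated series for 1/y."""
    mant, exp = math.frexp(y)                 # y = mant * 2**exp, 0.5 <= mant < 1
    scale = 1 << SEED_BITS
    # Coarse reciprocal of the mantissa, truncated to SEED_BITS fraction bits.
    seed = math.ldexp(math.floor(scale / mant) / scale, -exp)
    m = 1.0 - y * seed                        # small residual, 0 <= m < 2**-SEED_BITS
    # 1/y = seed / (1 - m) = seed * (1 + m + m**2 + ...)
    series, power = 0.0, 1.0
    for _ in range(terms):
        series += power
        power *= m
    return x * seed * series


def taylor_sqrt(x):
    """Approximate sqrt(x) for x > 0 as x * (1/sqrt(x)), same series shape."""
    mant, exp = math.frexp(x)
    if exp % 2:                               # make the exponent even
        mant *= 2.0
        exp -= 1
    scale = 1 << SEED_BITS
    # Coarse 1/sqrt(mant); math.sqrt here only emulates a table lookup.
    seed = math.ldexp(math.floor(scale / math.sqrt(mant)) / scale, -(exp // 2))
    m = 1.0 - x * seed * seed                 # small residual, m >= 0
    # 1/sqrt(x) = seed * (1 - m)**-0.5 = seed * (1 + m/2 + 3*m**2/8 + ...)
    series = 1.0 + m / 2.0 + 3.0 * m * m / 8.0
    return x * seed * series
```

Both routines share the same structure: a seed, a residual `m`, and a short polynomial in `m` that differs only in its coefficients. That shared structure is the kind of similarity that would let a square root path reuse most of a divide datapath. A real single-precision unit would also need correct rounding and special-case handling (zero, infinity, NaN, subnormals), which this sketch omits.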