Journal of Signal Processing Systems
Hardware support for floating-point (FP) arithmetic is a mandatory feature of modern microprocessor design. Although division and square root are relatively infrequent operations in traditional general-purpose applications, they are indispensable and increasingly important in many modern applications. Overall performance can therefore be strongly affected by the algorithms and implementations chosen for the FP divide (FP-Div) and FP square root (FP-Sqrt) units. In this paper, we propose a single-precision fused floating-point multiply/divide/square root unit based on a Taylor-series expansion algorithm. We extend an existing fused multiply/divide unit to incorporate the square root function with little area and latency overhead, since Taylor's theorem yields approximations of very similar form for many well-known functions. Implementation results for the proposed fused unit, using a standard-cell methodology in IBM 90nm technology, show that adding the square root function to an existing multiply/divide unit requires only a modest 18% area increase, and that divide and square root achieve the same low latency (12 cycles). The proposed arithmetic unit exhibits a reasonably good area-performance balance.
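The abstract does not spell out the series it uses, so the following is only a minimal sketch of why division and square root can share one datapath under a Taylor-series formulation. It assumes an operand mantissa y in [1, 2), a seed obtained from the top bits of y (standing in for a lookup table), and a series truncated after the quadratic term. The function names, the `lut_bits` parameter, and the truncation order are illustrative assumptions, not details taken from the paper.

```python
import math


def _truncate(y, lut_bits):
    # Keep only lut_bits fractional bits of y, mimicking a table index.
    scale = 1 << lut_bits
    return math.floor(y * scale) / scale


def taylor_divide(x, y, lut_bits=8):
    """Approximate x / y for y in [1, 2) (assumed range, exponent ignored)."""
    seed = 1.0 / _truncate(y, lut_bits)   # hypothetical table output, close to 1/y
    e = 1.0 - y * seed                    # small residual, |e| < 2**-lut_bits
    # 1/y = seed / (1 - e) = seed * (1 + e + e^2 + ...), truncated here.
    return x * seed * (1.0 + e + e * e)


def taylor_sqrt(y, lut_bits=8):
    """Approximate sqrt(y) for y in [1, 2) (assumed range, exponent ignored)."""
    seed = 1.0 / math.sqrt(_truncate(y, lut_bits))  # hypothetical table output, close to 1/sqrt(y)
    e = 1.0 - y * seed * seed                       # small residual
    # 1/sqrt(y) = seed * (1 - e)^(-1/2) = seed * (1 + e/2 + 3e^2/8 + ...).
    # Multiplying by y turns the reciprocal square root into sqrt(y).
    return y * seed * (1.0 + e / 2.0 + 3.0 * e * e / 8.0)
```

Both functions reduce to a seed, a residual built from multiplications, and a short polynomial in that residual with different coefficients. That structural similarity is the kind of reuse the abstract attributes to Taylor's theorem. With an 8-bit seed and a quadratic truncation, the relative error of this sketch is on the order of 2^-24, roughly single-precision accuracy before rounding.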