Floating-point multiply-add-fused with reduced latency

Authors:
T. Lang;J. D. Bruguera
Affiliations:
Dept. of Electr. Eng. & Comput. Sci., California Univ., Los Angeles, CA, USA;-
Venue:
IEEE Transactions on Computers
Year:
2004

Abstract

We propose an architecture for the computation of the double-precision floating-point multiply-add-fused (MAF) operation A + (B × C). This architecture is based on combining the addition and the rounding (using a dual adder) and on anticipating the normalization step so that it is performed before the addition. Because the normalization is performed before the addition, it is not possible to overlap the leading-zero anticipator (LZA) with the adder. Consequently, to avoid an increase in delay, we modify the design of the LZA so that the leading bits of its output are produced first and can be used to begin the normalization. Moreover, parts of the addition are also anticipated. We have estimated the delay of the resulting architecture, taking into account the load introduced by long connections, and we estimate a delay reduction of between 15 percent and 20 percent with respect to previous implementations.
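As an illustration of what the MAF operation computes (not of the proposed hardware), the following minimal Python sketch contrasts a single-rounding evaluation of A + (B × C) with the usual multiply-then-add sequence, which rounds twice. The function names `maf_reference` and `unfused` are hypothetical, and the exact-arithmetic model via `fractions.Fraction` is an assumption made for clarity; it handles only finite inputs and says nothing about latency.

```python
from fractions import Fraction


def maf_reference(a: float, b: float, c: float) -> float:
    """Hypothetical reference model: form a + b*c exactly, then round once.

    Assumes finite inputs; Fraction cannot represent infinities or NaNs.
    """
    exact = Fraction(a) + Fraction(b) * Fraction(c)
    # Converting a Fraction to float rounds to the nearest double.
    return float(exact)


def unfused(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the product is rounded before the addition."""
    return a + b * c


# Operands chosen so that the exact product 1 - 2**-60 rounds to 1.0 in
# double precision, which makes the unfused sum cancel to zero.
a = -1.0
b = 1.0 + 2.0**-30
c = 1.0 - 2.0**-30

fused_result = maf_reference(a, b, c)
unfused_result = unfused(a, b, c)

print("fused  :", fused_result)
print("unfused:", unfused_result)
```

With these operands the unfused sequence loses the low-order part of the product, while the single-rounding model keeps it. Massive cancellation of this kind is also the case in which the result has many leading zeros and must be normalized, which is the step the LZA is meant to anticipate.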