Floating-Point Fused Multiply-Add: Reduced Latency for Floating-Point Addition

Authors:
Javier D. Bruguera
Affiliations:
University of Santiago de Compostela
Venue:
ARITH '05 Proceedings of the 17th IEEE Symposium on Computer Arithmetic
Year:
2005

Citing 0
Cited 4

Design issues and implementations for floating-point divide-add fused
IEEE Transactions on Circuits and Systems II: Express Briefs

Bridge floating-point fused multiply-add design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems

Speculative hardware/software co-designed floating-point multiply-add fusion
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems

Ultra-low-power adder stage design for exascale floating point units
ACM Transactions on Embedded Computing Systems (TECS) - Special Issue on Design Challenges for Many-Core Processors, Special Section on ESTIMedia'13 and Regular Papers

Abstract

In this paper we propose an architecture for the double-precision floating-point multiply-add fused (MAF) operation A + (B × C) that computes floating-point addition with lower latency than floating-point multiplication and MAF. While previous MAF architectures compute the three operations with the same latency, the proposed architecture allows the first pipeline stages, those related to the multiplication B × C, to be skipped in the case of an addition. For instance, for a MAF unit pipelined into three or five stages, the latency of the floating-point addition is reduced to two or three cycles, respectively. To achieve this latency reduction, the alignment shifter, which in previous organizations operates in parallel with the multiplication, is moved so that the multiplication can be bypassed. To keep this modification from lengthening the critical path, a double-datapath organization is used, in which the alignment and normalization are in separate paths. Moreover, we use the previously developed techniques of combining the addition with the rounding and of performing the normalization before the addition.
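
The operation and the latency figures quoted in the abstract can be illustrated with a small behavioural sketch. It is not the paper's hardware design: the names `maf`, `latency`, and `ADD_LATENCY` are hypothetical, the single rounding is emulated with exact rational arithmetic for finite doubles only, and one pipeline stage is assumed to take one cycle.

```python
from fractions import Fraction

# Addition latencies quoted in the abstract, keyed by MAF pipeline depth.
# Only the two depths the abstract mentions are covered.
ADD_LATENCY = {3: 2, 5: 3}


def maf(a: float, b: float, c: float) -> float:
    """Emulate A + (B x C) with one rounding at the end (finite inputs only)."""
    exact = Fraction(a) + Fraction(b) * Fraction(c)  # exact, no intermediate rounding
    return float(exact)  # single rounding to double precision


def latency(op: str, stages: int, bypass: bool = True) -> int:
    """Cycles for 'add', 'mul', or 'maf', assuming one cycle per stage.

    With bypass=False every operation takes the full pipeline depth, as in
    the earlier MAF units the abstract describes. With bypass=True an
    addition skips the multiplication stages.
    """
    if op not in ("add", "mul", "maf"):
        raise ValueError(f"unknown operation: {op}")
    if op == "add" and bypass:
        return ADD_LATENCY[stages]
    return stages


if __name__ == "__main__":
    # An addition is the special case B = 1 of the fused operation.
    print(maf(1.5, 1.0, 2.25) == 1.5 + 2.25)

    # Fused and unfused results can differ because the product is not rounded.
    b, c = 1 + 2**-30, 1 - 2**-30
    print(maf(-1.0, b, c), -1.0 + b * c)

    for depth in (3, 5):
        print(depth, latency("add", depth, bypass=False), latency("add", depth))
```

Setting B = 1 shows why a conventional MAF unit pays the full pipeline latency for an addition: the operand still travels through the multiplier stages. The `bypass` flag models only the cycle counts stated in the abstract, not the double-datapath organization itself.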