An IEEE Compliant Floating-Point Adder that Conforms with the Pipelined Packet-Forwarding Paradigm

  • Authors:
  • Asger Munk Nielsen; David W. Matula; C. N. Lyu; Guy Even

  • Affiliations:
  • MIPS Denmark, Ballerup, Denmark; Southern Methodist Univ., Dallas, TX; Nisham Systems, San Jose, CA; Tel Aviv Univ., Tel Aviv, Israel

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 2000

Abstract

This paper presents a floating-point addition algorithm and adder pipeline design that employ a packet-forwarding pipeline paradigm. The packet-forwarding format and the proposed algorithms constitute a new paradigm for handling data hazards in deeply pipelined floating-point units. The addition and rounding algorithms employ a four-stage execution-phase pipeline, with each stage suitable for implementation in a short clock period, assuming about 15 logic levels per cycle. The first two cycles perform the addition proper and are the focus of this paper. The last two cycles perform the rounding and have been covered in a paper by Matula and Nielsen [8]. The addition algorithm accepts one operand in a standard binary floating-point format at the start of cycle one. The second operand is represented in the packet-forwarding floating-point format; that is, it is divided into four parts: the sign bit, the exponent string, the principal part of the significand, and the carry-round packet. The first three parts of the second operand are input at the start of cycle one, and the carry-round packet is input at the start of cycle two. The result is output in two formats, both of which represent the rounded result as required by the IEEE 754 standard. It is output in the packet-forwarding floating-point format at the end of cycles two and three, so that it can be forwarded to the start of the adder pipeline, or to a cooperating multiplier pipeline that accepts a packet-forwarding operand, with an effective latency of two cycles. It is also output in standard IEEE 754 binary format at the end of cycle four for retirement to a register. The proposed design thus achieves an effective latency of two cycles for successive dependent operations while preserving IEEE 754 binary floating-point compatibility.
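
The timing described in the abstract can be illustrated with a small scheduling model. The sketch below is not taken from the paper: the names `Schedule`, `schedule_chain`, and `forwarding_is_consistent` are hypothetical, and it assumes that the sign, exponent, and principal part of the result leave the adder at the end of cycle two while the carry-round packet leaves at the end of cycle three (the abstract states only that the packet-forwarding result is output "at the end of cycles two and three"). Under that assumption, the model checks that each forwarded part is produced before the cycle in which a dependent addition consumes it, and compares the total cycle count of a dependent chain with and without forwarding.

```python
from dataclasses import dataclass

PIPELINE_DEPTH = 4    # execution-phase stages, as stated in the abstract
FORWARD_LATENCY = 2   # effective latency for dependent operations, as stated


@dataclass(frozen=True)
class Schedule:
    start: int          # cycle in which the operation enters stage one
    principal_out: int  # end of this cycle: sign, exponent, principal part ready (assumed)
    packet_out: int     # end of this cycle: carry-round packet ready (assumed)
    retire: int         # end of this cycle: standard IEEE 754 result ready


def schedule_chain(n_ops: int, forwarding: bool = True) -> list:
    """Schedule n_ops additions where each one depends on the previous result."""
    if n_ops < 1:
        raise ValueError("n_ops must be at least 1")
    # With forwarding, a dependent operation starts two cycles after its
    # predecessor; without it, it waits for the register-format result.
    step = FORWARD_LATENCY if forwarding else PIPELINE_DEPTH
    chain = []
    for k in range(n_ops):
        start = 1 + k * step
        chain.append(
            Schedule(
                start=start,
                principal_out=start + 1,            # end of the op's cycle two
                packet_out=start + 2,               # end of the op's cycle three
                retire=start + PIPELINE_DEPTH - 1,  # end of the op's cycle four
            )
        )
    return chain


def forwarding_is_consistent(chain) -> bool:
    """Check that every forwarded part exists before the cycle that consumes it."""
    for prev, nxt in zip(chain, chain[1:]):
        # Sign, exponent, and principal part are consumed at the start of
        # the dependent operation's cycle one.
        if prev.principal_out >= nxt.start:
            return False
        # The carry-round packet is consumed at the start of its cycle two.
        if prev.packet_out >= nxt.start + 1:
            return False
    return True


if __name__ == "__main__":
    fast = schedule_chain(8, forwarding=True)
    slow = schedule_chain(8, forwarding=False)
    print("consistent:", forwarding_is_consistent(fast))
    print("cycles with forwarding:", fast[-1].retire)
    print("cycles without forwarding:", slow[-1].retire)
```

In this model, a chain of n dependent additions finishes after 2(n - 1) + 4 cycles with forwarding, against 4n cycles when each addition waits for the standard-format result of its predecessor.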