Efficient implementation of 3X for radix-8 encoding

Authors:
Gustavo A. Ruiz;Mercedes Granda
Affiliations:
Department of Electronics and Computers, Facultad de Ciencias, Universidad de Cantabria, Avda. de Los Castros s/n, 39005 Santander, Spain;Department of Electronics and Computers, Facultad de Ciencias, Universidad de Cantabria, Avda. de Los Castros s/n, 39005 Santander, Spain
Venue:
Microelectronics Journal
Year:
2008

Abstract

Several commercial processors have adopted the radix-8 multiplier architecture to increase speed by reducing the number of partial products. Radix-8 encoding shortens the signed-digit representation of the multiplier. Its performance bottleneck is the generation of the term 3X, also referred to as the hard multiple. This term is usually computed by a shift-and-add operation, 3X = 2X + X, in a high-speed adder. In a 2X + X addition, adjacent full adders share the same input signal. This property permits simpler algebraic expressions for the 3X operation than for a conventional addition. This paper shows that the 3X operation can be expressed in terms of two signals, H_i and K_i, functionally equivalent to two carries. H_i and K_i are computed in parallel using architectures that lead to an area- and speed-efficient implementation. For comparison, standard-cell implementations of conventional adders are evaluated against the proposed circuits based on the H_i and K_i signals. As a result, the delay of the proposed serial scheme is reduced by roughly 67% without additional cost in area; the delay and area of the carry look-ahead scheme are reduced by 20% and 17%, respectively; and those of the parallel prefix scheme are reduced by 26% and 46%, respectively.
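
The following is a minimal Python sketch of two ideas mentioned in the abstract, not the circuits proposed in the paper. It assumes a standard radix-8 Booth recoding (digits in -4..4, which is why the multiple 3X is needed) and a plain ripple-carry 2X + X addition in which the full adder at position i receives x_i and x_(i-1), so neighbouring adders share an input. The function names `booth_radix8_digits` and `triple_ripple` are hypothetical, and the H_i and K_i signals are not modelled because the abstract does not define them.

```python
def booth_radix8_digits(y, n):
    """Recode an n-bit two's-complement integer y into radix-8 digits in -4..4.

    Assumes -2**(n-1) <= y < 2**(n-1). Digits are returned least significant first.
    """
    def bit(k):
        # Python's >> on negative ints is arithmetic, so this sign-extends y.
        return (y >> k) & 1 if k >= 0 else 0

    num_digits = -(-n // 3)  # ceil(n / 3)
    digits = []
    for j in range(num_digits):
        # Overlapping 4-bit window: y[3j+2], y[3j+1], y[3j], y[3j-1]
        d = -4 * bit(3 * j + 2) + 2 * bit(3 * j + 1) + bit(3 * j) + bit(3 * j - 1)
        digits.append(d)
    return digits


def triple_ripple(x, n):
    """Compute 3X for an n-bit unsigned x as a bit-serial 2X + X addition."""
    bits = [(x >> i) & 1 for i in range(n)]
    carry = 0
    result = 0
    for i in range(n + 2):  # 3X needs at most n + 2 bits
        a = bits[i] if i < n else 0            # bit i of X
        b = bits[i - 1] if 1 <= i <= n else 0  # bit i of 2X, i.e. x_(i-1)
        s = a ^ b ^ carry
        carry = (a & b) | (a & carry) | (b & carry)
        result |= s << i
    return result


if __name__ == "__main__":
    y = -77
    digits = booth_radix8_digits(y, 8)
    # The recoded digits reconstruct y; a digit of +3 or -3 selects the hard multiple.
    print(digits, sum(d * 8**j for j, d in enumerate(digits)) == y)
    print(triple_ripple(45, 8) == 3 * 45)
```

In this sketch the input `b` of adder i is the same bit as input `a` of adder i-1, which is the shared-input property the abstract refers to; the serial carry chain here is the conventional baseline that the paper's schemes aim to improve on.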