Low power, low cost, and high performance dictate the design of many microprocessors targeted to the low-power computing market. The floating-point unit occupies a significant percentage of the silicon area in a microprocessor due to its wide data bandwidth (for double-precision computations) and the area occupied by the multiply array. For microprocessors designed for portable products, the size of the floating-point unit plays an important role in keeping cost low, since cost is driven by chip area. Some microprocessors have multiply-add fused floating-point units with a reduced multiply array, requiring two passes through the array for operations involving double-precision multiplies. This paper discusses the design complexities around the dual-pass multiply array and its effect on area and performance. Floating-point unit areas and their associated multiply array areas are compared for a single-pass and a dual-pass implementation in a given technology (the PowerPC 604e™ and PowerPC 603e™ microprocessors, respectively).
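
The arithmetic behind a dual-pass array can be illustrated with a minimal sketch. It assumes an unsigned 53-bit significand and a hypothetical array that consumes 27 multiplier bits per pass; the abstract does not state the actual split, the recoding scheme, or how the fused add is merged, so the function name `dual_pass_multiply` and the parameters `width` and `pass_bits` are illustrative only.

```python
def dual_pass_multiply(a: int, b: int, width: int = 53, pass_bits: int = 27) -> int:
    """Multiply two unsigned significands with a reduced array used twice.

    Assumption (not from the source): the array handles `pass_bits`
    multiplier bits per pass, so a `width`-bit multiplier needs two passes.
    """
    assert 0 <= a < (1 << width) and 0 <= b < (1 << width)
    assert 2 * pass_bits >= width, "two passes must cover the whole multiplier"

    mask = (1 << pass_bits) - 1
    b_lo = b & mask            # multiplier bits consumed in the first pass
    b_hi = b >> pass_bits      # multiplier bits consumed in the second pass

    # Pass 1: the reduced array forms the product with the low multiplier bits.
    partial = a * b_lo

    # Pass 2: the same array is reused for the high multiplier bits; the result
    # is aligned by pass_bits positions and accumulated with the first pass.
    return partial + ((a * b_hi) << pass_bits)


if __name__ == "__main__":
    x = (1 << 52) | 0x123456789ABCD   # example 53-bit significand
    y = (1 << 52) | 0xFEDCBA987654    # example 53-bit significand
    print(dual_pass_multiply(x, y) == x * y)
```

The sketch shows the trade-off in the abstract: the array only needs to handle about half of the multiplier bits at a time, which reduces its area, while a double-precision multiply takes an extra pass.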