Advances in semiconductor technology bring to the market incredibly dense devices capable of hosting tens to hundreds of floating-point operators on a single chip; the latest field-programmable gate arrays (FPGAs) are no exception. To reduce the complexity of deploying these devices in computationally intensive applications, this article proposes hardware schemes for addition-related floating-point operators based on the self-alignment technique (SAT). The article demonstrates that the schemes guarantee accuracy as if the sum were computed exactly in the precision of the operator's internal mantissa and then faithfully rounded to working precision. To achieve this, the article adopts the redundant high-radix carry-save (HRCS) format for the rapid addition of wide mantissas. Implementation results show that combining the SAT with the HRCS format allows complex operators to be implemented with reduced area and latency, more so when a fused-path approach is adopted. The article also proposes a new hardware operator for performing endomorphic HRCS additions and presents a new technique for speeding up the conversion from the redundant HRCS format to a conventional binary format.
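The key property of redundant carry-save formats is that a running sum can be kept as a (sum, carry) pair, so each accumulation step is carry-free and only one carry-propagating addition is needed at the end to convert back to conventional binary. As an illustrative sketch only (ordinary radix-2 carry-save on integers, not the paper's HRCS hardware or its floating-point alignment logic), the idea can be modeled as follows:

```python
def csa(a, b, c):
    """3:2 compressor: reduce three operands to a redundant (sum, carry)
    pair without propagating carries across the word."""
    s = a ^ b ^ c                              # bitwise sum, carries ignored
    k = ((a & b) | (a & c) | (b & c)) << 1     # generated carries, shifted up
    return s, k

def accumulate(values):
    """Accumulate integers in redundant carry-save form; a single
    conventional addition at the end converts back to binary."""
    s, k = 0, 0
    for v in values:
        s, k = csa(s, k, v)   # each step has constant logic depth
    return s + k              # the one carry-propagating add

print(accumulate([3, 5, 7, 11]))  # -> 26
```

The same principle underlies the HRCS format, except that the redundancy is organized in high-radix digits so wide mantissas can be added with short, digit-local carry chains; the paper's contribution includes making the final redundant-to-binary conversion fast in hardware.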