A principal limit on the accuracy of scientific computation in floating-point arithmetic is the computation of repeated sums, such as those that arise in inner products. A systolic super summer of cellular design is proposed for computing repeated sums of floating-point numbers at high throughput. The apparatus receives pipelined input streams of summands from one or many sources. The floating-point summands are converted into fixed-point form by a sieve-like, pipelined, cellular packet-switching device with signal combining. The emerging fixed-point numbers are then summed in a corresponding network of extremely long accumulators (i.e., super accumulators). At the cell level, the design uses a synchronous model of VLSI. The time the apparatus needs to compute an entire sum depends on the values of the summands; at this architectural level, the design is asynchronous. The throughput per unit area of hardware approaches that of a tree network, but without the long wires and signal propagation delays that are intrinsic to tree networks.
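The arithmetic idea behind a super accumulator can be illustrated in software. The following is a minimal sketch, not the systolic hardware design described above: it assumes IEEE 754 double-precision summands, uses a Python arbitrary-precision integer to stand in for the extremely long accumulator, and rounds only once at the end. The names `MIN_EXP`, `to_fixed`, and `super_sum` are hypothetical and chosen for illustration.

```python
import math

# Assumption: summands are IEEE 754 doubles, whose smallest positive
# (subnormal) value is 2**-1074. Every finite double is an integer
# multiple of this unit, so it can serve as the fixed-point grid.
MIN_EXP = -1074


def to_fixed(x: float) -> int:
    """Convert a finite double exactly into an integer count of 2**MIN_EXP units."""
    if math.isinf(x) or math.isnan(x):
        raise ValueError("only finite summands are supported")
    m, e = math.frexp(x)        # x == m * 2**e with 0.5 <= |m| < 1, or m == 0
    mant = int(m * (1 << 53))   # exact: m carries at most 53 significant bits
    shift = e - 53 - MIN_EXP    # where the mantissa lands in the long register
    # For subnormal inputs the shift is negative, but the discarded low bits
    # are all zero, so the right shift loses no information.
    return mant << shift if shift >= 0 else mant >> -shift


def super_sum(summands) -> float:
    """Accumulate exactly in a long fixed-point register, then round once."""
    acc = 0                     # unbounded integer plays the super accumulator
    for x in summands:
        acc += to_fixed(x)
    # Converting back to floating point is the only rounding step.
    return acc / (1 << -MIN_EXP)
```

For example, `super_sum([1e100, 1.0, -1e100])` returns `1.0`, because the small summand is not absorbed by the large ones during accumulation. A hardware realization has to bound the accumulator length and distribute the work across cells, which this sketch does not attempt to model.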