Optimal Circuits for Parallel Multipliers
IEEE Transactions on Computers
Tight integration of timing-driven synthesis and placement of parallel multiplier circuits
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A timing-driven hybrid-compression algorithm for faster Sum-of-Products
CSS '07 Proceedings of the Fifth IASTED International Conference on Circuits, Signals and Systems
IEEE Transactions on Circuits and Systems Part I: Regular Papers - Special section on 2009 IEEE system-on-chip conference
Synthesis of Adaptable Hybrid Adders for Area Optimization under Timing Constraint
ACM Transactions on Design Automation of Electronic Systems (TODAES)
A just-in-time customizable processor
Proceedings of the International Conference on Computer-Aided Design
Hi-index | 0.01 |
Multiply-accumulate is an important and expensive operation. It is frequently used in digital signal processing and video/graphics applications. As a result, any improvement in the delay for performing this operation could have a positive impact on clock speed, instruction time and processor performance. In this paper, we show how, by extending our view of a parallel multiplier, we can apply recent innovations in parallel multiplier design to multiply-accumulators. This application results in multiply-accumulators that are as fast as multipliers of the same size (these multipliers have been shown to result in provably optimal delays faster than current designs). This allows a single (optimal multiply-accumulate) circuit to be used for both operations without delay penalty. As a result, multiply-accumulate can be efficiently and effectively implemented as an instruction in RISC CPUs. Additionally, the circuit design reduces the number of devices needed over current fast multiplier designs, so that real estate and power savings also result.