The performance of multiplication is crucial for multimedia applications such as 3D graphics and signal processing systems, which execute large numbers of multiplications. Previously reported algorithms focused mainly on rapidly reducing the partial product rows down to the final sums and carries used for the final accumulation. These techniques mostly rely on circuit optimization and minimization of the critical paths. In this paper, an algorithm for fast multiplication in two's complement representation is presented. Rather than focusing on reducing the partial product rows down to final sums and carries, our approach generates fewer partial product rows in the first place. This shortens the overall operation time even before partial product reduction techniques are applied. In addition to the speed improvement, our algorithm yields a true diamond shape for the partial product tree, which is more efficient to implement. Synthesis results for our multiplication algorithm using the Artisan TSMC 0.13 µm 1.2-volt standard-cell library show a 13 percent improvement in speed and 14 percent power savings for 8-bit × 8-bit multiplications (10 percent and 3 percent, respectively, for 16-bit × 16-bit multiplications) when compared to conventional multiplication algorithms.
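To make the notion of "fewer partial product rows" concrete, the sketch below uses classic radix-4 (modified) Booth recoding. This is a well-known textbook technique chosen purely for illustration; it is not the algorithm proposed in this paper, whose details the abstract does not give. The function names (`booth_radix4_digits`, `partial_product_rows`, `multiply`) are hypothetical. The sketch assumes an even operand width `n` and models each row as an integer rather than as hardware signals.

```python
def booth_radix4_digits(y: int, n: int) -> list[int]:
    """Recode an n-bit two's complement multiplier y into n/2 digits in {-2..2}.

    Illustrative radix-4 Booth recoding (an assumption for this sketch,
    not the paper's method). Digit j equals y[2j-1] + y[2j] - 2*y[2j+1].
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even width")
    if not -(1 << (n - 1)) <= y < (1 << (n - 1)):
        raise ValueError("y does not fit in n-bit two's complement")
    pattern = y & ((1 << n) - 1)  # raw two's complement bit pattern
    # bits[0] is the implicit y[-1] = 0; bits[k + 1] holds y[k].
    bits = [0] + [(pattern >> k) & 1 for k in range(n)]
    return [bits[i] + bits[i + 1] - 2 * bits[i + 2] for i in range(0, n, 2)]


def partial_product_rows(x: int, y: int, n: int) -> list[int]:
    """Return one weighted partial product per Booth digit (n/2 rows)."""
    return [d * x * 4**j for j, d in enumerate(booth_radix4_digits(y, n))]


def multiply(x: int, y: int, n: int) -> int:
    """Sum the partial product rows; the result equals x * y."""
    return sum(partial_product_rows(x, y, n))
```

For an 8-bit multiplier, `partial_product_rows(x, y, 8)` returns four rows, whereas a scheme with one row per multiplier bit would produce eight. Fewer rows leave less work for the subsequent reduction stage, which is the general effect the abstract describes.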