Hardware Implementation of Montgomery's Modular Multiplication Algorithm
IEEE Transactions on Computers
FPGA '02 Proceedings of the 2002 ACM/SIGDA tenth international symposium on Field-programmable gate arrays
Handbook of Applied Cryptography
Handbook of Applied Cryptography
Systolic Modular Multiplication
IEEE Transactions on Computers
A Systolic, Linear-Array Multiplier for a Class of Right-Shift Algorithms
IEEE Transactions on Computers
Precise Bounds for Montgomery Modular Multiplication and Some Potentially Insecure RSA Moduli
CT-RSA '02 Proceedings of the The Cryptographer's Track at the RSA Conference on Topics in Cryptology
A Scalable Architecture for Montgomery Multiplication
CHES '99 Proceedings of the First International Workshop on Cryptographic Hardware and Embedded Systems
Montgomery Modular Exponentiation on Reconfigurable Hardware
ARITH '99 Proceedings of the 14th IEEE Symposium on Computer Arithmetic
Hardware Implementation of a Montgomery Modular Multiplier in a Systolic Array
IPDPS '03 Proceedings of the 17th International Symposium on Parallel and Distributed Processing
A Scalable Architecture for Modular Multiplication Based on Montgomery's Algorithm
IEEE Transactions on Computers
A Computer Algorithm for Calculating the Product AB Modulo M
IEEE Transactions on Computers
An efficient implementation of montgomery powering ladder in reconfigurable hardware
SBCCI '10 Proceedings of the 23rd symposium on Integrated circuits and system design
International Journal of Reconfigurable Computing - Special issue on selected papers from the southern programmable logic conference (SPL2010)
Hi-index | 0.00 |
This paper presents a new scalable hardware implementing modular multiplication. A high radix Montgomery multiplication algorithm without final subtraction is used to perform this operation. An alternative proof for the final Montgomery multiplication by 1, removing the condition on the modulus, is given. This hardware fits in any chip area and is able to work with any size of modulus. Unlike other scalable designs only one cell is used. This cell contains standard and well optimized digit multiplier and adder. Time-area trade-offs are also available before hardware synthesis for differents sizes of internal data path. The pipeline architecture of the multiplier component increases the clock frequency and the throughput. Time-area trade-offs are analyzed in order to make the best choice for given time and area constraints. This architecture seems to provide a better time-area compromise than previous scalable hardware.