This paper compares two Montgomery modular multiplication architectures, one systolic and one multiplexed, both targeting FPGA devices. Modular multiplication is the core of modular exponentiation, which is the most important operation in several public-key cryptographic algorithms, including the most popular of them, RSA. The proposed systolic architecture is a high-radix implementation with a one-dimensional array of Processing Elements. The multiplexed implementation is a new alternative composed of multiplier blocks in parallel with new simplified Processing Elements, and it provides a pipelined operation mode. We compare the time × area efficiency of both architectures, as well as their performance in an RSA application. The systolic implementation runs the 1024-bit RSA decryption process in just 3.23 ms, while the multiplexed architecture executes the same operation in 4.36 ms but saves up to 28% of logic resources. These results are competitive with state-of-the-art performance.
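
For readers unfamiliar with the underlying arithmetic, the following is a minimal software sketch of Montgomery multiplication and of its use in modular exponentiation. It is not the systolic or multiplexed architecture described above: it uses the textbook bit-serial (radix-2) form rather than a high-radix one, and the function names `montgomery_multiply` and `montgomery_modexp` are hypothetical, chosen only for illustration.

```python
def montgomery_multiply(a, b, n, k):
    """Return a * b * 2^(-k) mod n (radix-2, bit-serial form).

    Assumes n is odd, n < 2^k, and 0 <= a, b < n.
    """
    s = 0
    for i in range(k):
        # Add b when bit i of a is set.
        if (a >> i) & 1:
            s += b
        # Make s even by adding n (n is odd), so the halving is exact.
        if s & 1:
            s += n
        s >>= 1
    # s stays below 2n throughout, so one conditional subtraction suffices.
    if s >= n:
        s -= n
    return s


def montgomery_modexp(base, exponent, n):
    """Return base^exponent mod n using left-to-right square-and-multiply."""
    if n % 2 == 0 or n < 3:
        raise ValueError("n must be an odd modulus >= 3")
    k = n.bit_length()
    r2 = pow(2, 2 * k, n)                          # R^2 mod n, with R = 2^k
    x = montgomery_multiply(base % n, r2, n, k)    # base in Montgomery form
    acc = montgomery_multiply(1, r2, n, k)         # 1 in Montgomery form (R mod n)
    for bit in bin(exponent)[2:]:
        acc = montgomery_multiply(acc, acc, n, k)  # square
        if bit == "1":
            acc = montgomery_multiply(acc, x, n, k)  # multiply
    # Multiplying by 1 converts the result back out of Montgomery form.
    return montgomery_multiply(acc, 1, n, k)


if __name__ == "__main__":
    # Toy RSA parameters (far too small for real use), for illustration only.
    n, e, d = 3233, 17, 2753
    message = 65
    cipher = montgomery_modexp(message, e, n)
    print(montgomery_modexp(cipher, d, n) == message)
```

The inner loop replaces division by the modulus with additions and one-bit shifts, which is the property that makes the algorithm attractive for hardware. A high-radix design processes several bits of the operand per step instead of one.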