This article proposes the Computing with the Residue Number System (CRNS) framework, which automates the design of accelerators for Modular Arithmetic (MA). The framework provides a complete toolchain, from a programming language and its compiler to back-ends targeting parallel computation platforms such as Graphics Processing Units (GPUs) and reconfigurable hardware. Given an input algorithm described in a high-level programming language, CRNS produces in a few seconds either the corresponding optimized Parallel Thread Execution (PTX) program, ready to run on GPUs, or the Hardware Description Language (HDL) specification of a fully functional accelerator suitable for reconfigurable hardware and embedded systems. The resulting implementations exploit the parallelization properties of Residue Number System (RNS) arithmetic in a fully automated way, so designers do not need to be familiar with the mathematical details of the underlying arithmetic, namely the RNS representation. To describe and evaluate the proposed framework, experimental results are presented for both supported back-ends (GPU and HDL), targeting the modular exponentiation used in the Rivest-Shamir-Adleman (RSA) algorithm and Elliptic Curve (EC) point multiplication. The results suggest competitive latency and throughput with minimal design effort, while avoiding the development issues that arise in the specification and verification of dedicated solutions.
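The parallelization property mentioned above can be illustrated with a minimal Python sketch. It is not the CRNS framework or its language: the moduli and the function names (`to_rns`, `rns_mul`, `from_rns`) are assumptions chosen only for illustration. An integer is represented by its residues modulo a set of pairwise-coprime moduli, multiplication proceeds independently in each residue channel, and the result is recovered with the Chinese Remainder Theorem.

```python
from math import prod

# Hypothetical basis of small pairwise-coprime moduli (illustration only).
# Values are representable as long as they stay below the product M = 96577.
MODULI = (13, 17, 19, 23)


def to_rns(x, moduli=MODULI):
    """Return the residues of x, one per modulus (one per channel)."""
    return tuple(x % m for m in moduli)


def rns_mul(a, b, moduli=MODULI):
    """Multiply channel by channel; no carries propagate between channels,
    so each channel could be handled by a separate thread or hardware unit."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Reconstruct the integer in [0, M) with the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m                       # product of all the other moduli
        total += r * Mi * pow(Mi, -1, m)  # pow(..., -1, m): modular inverse
    return total % M


x, y = 123, 456
product = from_rns(rns_mul(to_rns(x), to_rns(y)))
print(product == x * y)  # True, because x * y is smaller than M
```

The sketch only covers plain multiplication within the dynamic range M. Reduction modulo an RSA or EC modulus needs additional machinery, such as RNS Montgomery multiplication with base extension, which is omitted here.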