This paper presents a hardware accelerator that can effectively improve the security and the performance of virtually any RSA cryptographic application. The accelerator integrates two crucial security- and performance-enhancing facilities: an RSA processor and an RSA key-store. An RSA processor is a dedicated hardware block that executes the RSA algorithm. An RSA key-store is a dedicated device for securely storing RSA key pairs. We chose RSA because it is by far the most widely adopted standard in public-key cryptography. We describe the main functional blocks of the hardware accelerator and their interactions, and comment on the architectural solutions we adopted to maximize security and performance while minimizing the cost in terms of hardware resources. We then present an FPGA-based implementation of the proposed architecture, which relies on a commercial off-the-shelf (COTS) programmable hardware board. Finally, we evaluate the system in terms of performance and chip area occupation, and comment on the design trade-offs resulting from different levels of parallelism.
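
To make the two roles concrete, the following is a minimal software sketch, not the architecture described in the paper. It assumes textbook RSA without padding and uses toy parameters (p = 61, q = 53) far too small for real use. The names `mod_exp` and `ToyKeyStore` are hypothetical and chosen for illustration only. `mod_exp` stands in for the modular exponentiation an RSA processor would execute, and `ToyKeyStore` mimics a key-store by performing the private-key operation internally, so callers never read the private exponent.

```python
def mod_exp(base: int, exponent: int, modulus: int) -> int:
    """Left-to-right binary (square-and-multiply) modular exponentiation."""
    result = 1
    base %= modulus
    for bit in bin(exponent)[2:]:
        # Square once per exponent bit.
        result = (result * result) % modulus
        # Multiply by the base only when the bit is set.
        if bit == "1":
            result = (result * base) % modulus
    return result


class ToyKeyStore:
    """Hypothetical key-store: the private exponent is used only internally."""

    def __init__(self, n: int, e: int, d: int) -> None:
        self.n = n
        self.e = e
        self._d = d  # assumption: kept private by convention in this sketch

    def public_key(self) -> tuple[int, int]:
        return self.n, self.e

    def private_op(self, value: int) -> int:
        # Decrypt or sign inside the store; only the result leaves it.
        return mod_exp(value, self._d, self.n)


# Toy key: n = 61 * 53 = 3233, phi = 3120, e = 17, d = 2753 (17 * 2753 = 1 mod 3120).
store = ToyKeyStore(n=3233, e=17, d=2753)
n, e = store.public_key()

message = 65
ciphertext = mod_exp(message, e, n)       # public-key operation
recovered = store.private_op(ciphertext)  # private-key operation
```

In this sketch the cost is dominated by the repeated modular multiplications inside `mod_exp`, which is the kind of operation a dedicated RSA processor is meant to offload from the host.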