Over the last two decades there has been a considerable increase in the adoption of security solutions in enterprise networks, e-commerce websites and database management systems. Companies, banks, government departments and other institutions need to create secure connections over ever-expanding networks without letting these security solutions reduce system throughput. Most communication security is implemented using cryptographic algorithms, and applications of these algorithms are compute-intensive. Cryptographic algorithms are therefore implemented in custom hardware to achieve higher performance than software implementations running on general-purpose processors. In this paper we present a new hardware data structure, the Shufflebox. It replaces the simple register file or scratch memory that any cryptographic engine needs to store subkeys, S-box values or temporary results. The Shufflebox is a rectangular array of bits that can store, XOR and rotate all bits in all directions, which allows efficient implementation of these operations, all of which are critical in cryptography. The hardware implementation that employs this data structure achieves a speedup of between 6x and 18x over conventional implementations.
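The three operations named in the abstract (store, XOR, rotate in all directions) can be illustrated with a small software model. This is only a minimal sketch under stated assumptions, not the authors' hardware design: the class name `ShuffleboxModel`, its method names, the row-as-integer representation, and the reading of "rotate in all directions" as a cyclic rotation of the whole array left, right, up or down are all hypothetical choices made for illustration.

```python
class ShuffleboxModel:
    """Hypothetical software model of a rectangular bit array.

    Assumption: each row is held as an integer of `cols` bits, and a
    rotation moves every bit of the array cyclically in one direction.
    """

    def __init__(self, rows, cols):
        self.rows = rows
        self.cols = cols
        self.mask = (1 << cols) - 1   # keeps each row to `cols` bits
        self.grid = [0] * rows

    def store(self, row, value):
        """Write a value (e.g. a subkey word) into one row."""
        self.grid[row] = value & self.mask

    def xor_row(self, row, value):
        """XOR a value into one row in place."""
        self.grid[row] ^= value & self.mask

    def rotate_left(self, n=1):
        """Cyclically rotate every row left by n bit positions."""
        n %= self.cols
        self.grid = [
            ((r << n) | (r >> (self.cols - n))) & self.mask
            for r in self.grid
        ]

    def rotate_right(self, n=1):
        """Cyclically rotate every row right by n bit positions."""
        self.rotate_left(self.cols - (n % self.cols))

    def rotate_up(self, n=1):
        """Cyclically move rows upward: row i takes the content of row i + n."""
        n %= self.rows
        self.grid = self.grid[n:] + self.grid[:n]

    def rotate_down(self, n=1):
        """Cyclically move rows downward: row i + n takes the content of row i."""
        self.rotate_up(self.rows - (n % self.rows))


# Usage: a 4 x 8 array holding one stored word.
box = ShuffleboxModel(rows=4, cols=8)
box.store(0, 0b10010110)
box.rotate_left(1)      # row 0 becomes 0b00101101
box.xor_row(0, 0xFF)    # row 0 becomes 0b11010010
box.rotate_down(1)      # the word moves from row 0 to row 1
```

In software each rotation costs a pass over the whole array. The abstract's claim concerns a hardware structure that performs such operations directly on the stored bits, which this model does not attempt to reproduce or measure.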