Efficient Software Implementation of AES on 32-Bit Platforms
CHES '02 Revised Papers from the 4th International Workshop on Cryptographic Hardware and Embedded Systems
CARDIS '98 Proceedings of the The International Conference on Smart Card Research and Applications
Efficient block ciphers for smartcards
WOST'99 Proceedings of the USENIX Workshop on Smartcard Technology on USENIX Workshop on Smartcard Technology
An integer linear programming approach for identifying instruction-set extensions
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
AES Efficient implementation for extensible firmware interface
MS'06 Proceedings of the 17th IASTED international conference on Modelling and simulation
New AES Software Speed Records
INDOCRYPT '08 Proceedings of the 9th International Conference on Cryptology in India: Progress in Cryptology
International Journal of High Performance Systems Architecture
FSE'10 Proceedings of the 17th international conference on Fast software encryption
Fast software implementation of AES-CCM on multiprocessors
ICA3PP'11 Proceedings of the 11th international conference on Algorithms and architectures for parallel processing - Volume Part II
AES software implementations on ARM7TDMI
INDOCRYPT'06 Proceedings of the 7th international conference on Cryptology in India
Hi-index | 0.00 |
The Advanced Encryption Standard (AES) contest, started by the U.S. National Institute of Standards and Technology (NIST), saw the Rijndael [13] algorithm as its winner [11]. Although the AES is fully defined in terms of functionality, it requires best exploitation of architectural parameters in order to reach the optimum performance on specific architectures. Our work concentrates on ARM cores [1] widely used in the embedded industry. Most promising implementation choices for the common ARM Instruction Set Architecture (ISA) are identified, and a new implementation for the linear mixing layer is proposed. The performance improvement over current implementations is demonstrated by a case study on the Intel StrongARM SA-1110 Microprocessor [2]. Further improvements based on exploitation of memory hierarchies are also described, and the corresponding performance figures are presented.