A New Version of the Stream Cipher SNOW
SAC '02 Revised Papers from the 9th Annual International Workshop on Selected Areas in Cryptography
Fast Software Implementations of SC2000
ISC '02 Proceedings of the 5th International Conference on Information Security
FOX: a new family of block ciphers
SAC'04 Proceedings of the 11th international conference on Selected Areas in Cryptography
High-speed & Low Area Hardware Architectures of the Whirlpool Hash Function
Journal of VLSI Signal Processing Systems
Sosemanuk, a Fast Software-Oriented Stream Cipher
New Stream Cipher Designs
New AES Software Speed Records
INDOCRYPT '08 Proceedings of the 9th International Conference on Cryptology in India: Progress in Cryptology
Multilane HMAC: security beyond the birthday limit
INDOCRYPT'07 Proceedings of the cryptology 8th international conference on Progress in cryptology
Montgomery's trick and fast implementation of masked AES
AFRICACRYPT'11 Proceedings of the 4th international conference on Progress in cryptology in Africa
Instruction set extensions for efficient AES implementation on 32-bit processors
CHES'06 Proceedings of the 8th international conference on Cryptographic Hardware and Embedded Systems
How far can we go on the x64 processors?
FSE'06 Proceedings of the 13th international conference on Fast Software Encryption
Ensuring fast implementations of symmetric ciphers on the intel pentium 4 and beyond
ACISP'06 Proceedings of the 11th Australasian conference on Information Security and Privacy
Bitslice implementation of AES
CANS'06 Proceedings of the 5th international conference on Cryptology and Network Security
Hi-index | 0.00 |
This paper discusses the state-of-the-art software optimization methodology for symmetric cryptographic primitives on Pentium III and 4 processors. We aim at maximizing speed by considering the internal pipeline architecture of these processors. This is the first paper studying an optimization of ciphers on Prescott, a new core of Pentium 4. Our AES program with 128-bit key achieves 251 cycles/block on Pentium 4, which is, to our best knowledge, the fastest implementation of AES on Pentium 4. We also optimize SNOW2.0 keystream generator. Our program of SNOW2.0 for Pentium III runs at the rate of 2.75 íops/cycle, which seems the most efficient code ever made for a real-world cipher primitive. For FOX128 block cipher, we propose a technique for speeding-up by interleaving two independent blocks using a register group separation. Finally we consider fast implementation of SHA512 and Whirlpool, two hash functions with a genuine 64-bit architecture. It will be shown that new SIMD instruction sets introduced in Pentium 4 excellently contribute to fast hashing of SHA512.