How to maximize software performance of symmetric primitives on pentium III and 4 processors

  • Authors:
  • Mitsuru Matsui;Sayaka Fukuda

  • Affiliations:
  • Information Technology R&D Center, Mitsubishi Electric Corporation, Kamakura, Kanagawa, Japan;Information Technology R&D Center, Mitsubishi Electric Corporation, Kamakura, Kanagawa, Japan

  • Venue:
  • FSE'05 Proceedings of the 12th international conference on Fast Software Encryption
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper discusses the state-of-the-art software optimization methodology for symmetric cryptographic primitives on Pentium III and 4 processors. We aim at maximizing speed by considering the internal pipeline architecture of these processors. This is the first paper studying an optimization of ciphers on Prescott, a new core of Pentium 4. Our AES program with 128-bit key achieves 251 cycles/block on Pentium 4, which is, to our best knowledge, the fastest implementation of AES on Pentium 4. We also optimize SNOW2.0 keystream generator. Our program of SNOW2.0 for Pentium III runs at the rate of 2.75 íops/cycle, which seems the most efficient code ever made for a real-world cipher primitive. For FOX128 block cipher, we propose a technique for speeding-up by interleaving two independent blocks using a register group separation. Finally we consider fast implementation of SHA512 and Whirlpool, two hash functions with a genuine 64-bit architecture. It will be shown that new SIMD instruction sets introduced in Pentium 4 excellently contribute to fast hashing of SHA512.