A fast RSA implementation on Itanium 2 processor

  • Authors:
  • Kazuyoshi Furukawa;Masahiko Takenaka;Kouichi Itoh

  • Affiliations:
  • FUJITSU LABORATORIES LTD., Kawasaki, Japan;FUJITSU LABORATORIES LTD., Kawasaki, Japan;FUJITSU LABORATORIES LTD., Kawasaki, Japan

  • Venue:
  • ICICS'06 Proceedings of the 8th international conference on Information and Communications Security
  • Year:
  • 2006

Abstract

We present the fastest implementation of RSA on Itanium 2. To achieve this, we improved the implementation algorithm for Montgomery multiplication proposed by Itoh et al. Our algorithm incurs less pipeline delay on Itanium 2 than the previous one. We implemented it as code highly optimized for parallel processing: it executes 4 instructions per cycle (Itanium 2 can execute at most 6 instructions per cycle), and its probability of pipeline stalling is only 5%. Our RSA implementation using this code performs 32 4096-bit RSA decryptions with CRT per second on Itanium 2 at 900 MHz. This makes it the fastest RSA implementation on Itanium 2; in the best case, it is 3.1 times faster than IPP, a software library developed by Intel.
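For readers unfamiliar with the two techniques the abstract names, the following is a minimal sketch of Montgomery multiplication and of RSA decryption with CRT. It is not the authors' Itanium 2 code, and it does not reproduce the word-level algorithm of Itoh et al.; it uses the textbook Montgomery reduction on Python big integers. All function names (`montgomery_setup`, `redc`, `mont_pow`, `rsa_crt_decrypt`) are hypothetical, and the sketch assumes Python 3.8 or later (for `pow(x, -1, m)`) and odd moduli greater than 1.

```python
def montgomery_setup(n, k):
    """Precompute constants for an odd modulus n with R = 2**k > n."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r   # satisfies n * n_prime = -1 (mod R)
    r2 = (r * r) % n                 # R^2 mod n, used to enter Montgomery form
    return n_prime, r2


def redc(t, n, k, n_prime):
    """Montgomery reduction: return t * R^-1 mod n, for 0 <= t < n * R."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask  # m = (t mod R) * n_prime mod R
    u = (t + m * n) >> k               # t + m*n is divisible by R
    return u - n if u >= n else u      # single conditional subtraction


def mont_pow(base, exp, n):
    """Compute base**exp mod n by square-and-multiply in Montgomery form."""
    k = n.bit_length()                          # guarantees R = 2**k > n
    n_prime, r2 = montgomery_setup(n, k)
    x = redc((base % n) * r2, n, k, n_prime)    # base * R mod n
    acc = redc(r2, n, k, n_prime)               # 1 * R mod n
    for bit in bin(exp)[2:]:
        acc = redc(acc * acc, n, k, n_prime)    # Montgomery squaring
        if bit == "1":
            acc = redc(acc * x, n, k, n_prime)  # Montgomery multiplication
    return redc(acc, n, k, n_prime)             # leave Montgomery form


def rsa_crt_decrypt(c, p, q, d):
    """RSA decryption with CRT: two half-size exponentiations, then recombine."""
    dp, dq = d % (p - 1), d % (q - 1)
    q_inv = pow(q, -1, p)
    m1 = mont_pow(c, dp, p)
    m2 = mont_pow(c, dq, q)
    h = (q_inv * (m1 - m2)) % p   # Garner's recombination step
    return m2 + h * q
```

With the toy parameters p = 61, q = 53, e = 17, d = 2753, calling `rsa_crt_decrypt(pow(65, 17, 61 * 53), 61, 53, 2753)` recovers the message 65. In a 4096-bit setting each of the two exponentiations works on a 2048-bit modulus, which is why the inner Montgomery multiplication dominates the running time and is the part the abstract describes optimizing.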