The Chinese Remainder Theorem and its application in a high-speed RSA crypto chip

  • Authors:
  • J. Großschädl

  • Affiliations:
  • -

  • Venue:
  • ACSAC '00 Proceedings of the 16th Annual Computer Security Applications Conference
  • Year:
  • 2000

Abstract

The performance of RSA hardware is primarily determined by an efficient implementation of the long-integer modular arithmetic and the ability to utilize the Chinese Remainder Theorem (CRT) for the private key operations. This paper presents the multiplier architecture of the RSAγ crypto-chip, a high-speed hardware accelerator for long-integer modular arithmetic. The RSAγ multiplier datapath is reconfigurable to execute either one 1024-bit modular exponentiation or two 512-bit modular exponentiations in parallel. Another significant characteristic of the multiplier core is its high degree of parallelism. The actual RSAγ prototype contains a 1056×16-bit word-serial multiplier which is optimized for modular multiplications according to P. Barrett's (1987) modular reduction method. The multiplier core is dimensioned for a clock frequency of 200 MHz and requires 227 clock cycles for a single 1024-bit modular multiplication. Pipelining in the highly parallel long-integer unit allows one to achieve a decryption rate of 560 kbit/s for a 1024-bit exponent. In CRT-mode, the multiplier executes two 512-bit modular exponentiations in parallel, which increases the decryption rate by a factor of 3.5 to almost 2 Mbit/s.
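
The CRT mode mentioned in the abstract can be illustrated in software. The following is a minimal Python sketch, not the chip's datapath: it splits one private-key exponentiation modulo n = p·q into two half-size exponentiations modulo p and q and recombines the results with Garner's formula. The function name `crt_decrypt` and the toy parameters (p = 61, q = 53, e = 17) are assumptions chosen for illustration; real keys use primes of 512 bits or more, and the two half-size exponentiations are the part a reconfigurable multiplier could run in parallel.

```python
def crt_decrypt(c, p, q, d):
    """RSA private-key operation c^d mod (p*q) using the CRT.

    Hypothetical helper for illustration only; it has no padding and
    no side-channel protection.
    """
    # Reduced exponents: each is about half the bit length of d.
    dp = d % (p - 1)
    dq = d % (q - 1)
    # Inverse of q modulo p (three-argument pow with -1 needs Python 3.8+).
    q_inv = pow(q, -1, p)

    # Two independent half-size modular exponentiations.
    m1 = pow(c, dp, p)
    m2 = pow(c, dq, q)

    # Garner's recombination: m = m2 + q * ((m1 - m2) * q_inv mod p).
    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q


# Toy parameters (assumed for this example, far too small for real use).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent

message = 65
ciphertext = pow(message, e, n)

# The CRT result must match the direct exponentiation modulo n.
assert crt_decrypt(ciphertext, p, q, d) == pow(ciphertext, d, n) == message
```

Each half-size exponentiation works on operands of half the length with an exponent of half the length, which is why CRT decryption is markedly faster than a single full-size exponentiation; the factor of 3.5 quoted in the abstract is the paper's own figure for the RSAγ hardware.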