SSE Implementation of Multivariate PKCs on Modern x86 CPUs

  • Authors: Anna Inn-Tung Chen; Ming-Shing Chen; Tien-Ren Chen; Chen-Mou Cheng; Jintai Ding; Eric Li-Hsiang Kuo; Frost Yu-Shuang Lee; Bo-Yin Yang

  • Affiliations: National Taiwan University, Taipei, Taiwan; Academia Sinica, Taipei, Taiwan; Academia Sinica, Taipei, Taiwan; National Taiwan University, Taipei, Taiwan; University of Cincinnati, Cincinnati, USA; Academia Sinica, Taipei, Taiwan; National Taiwan University, Taipei, Taiwan; Academia Sinica, Taipei, Taiwan

  • Venue: CHES '09: Proceedings of the 11th International Workshop on Cryptographic Hardware and Embedded Systems
  • Year: 2009

Abstract

Multivariate Public Key Cryptosystems (MPKCs) are often touted as future-proof against quantum computers. They have also been known for their efficiency compared to "traditional" alternatives. However, this advantage seems to erode as the arithmetic resources of modern CPUs grow and algorithms improve, especially with respect to Elliptic Curve Cryptography (ECC). In this paper, we show that hardware advances do not favor ECC alone. Modern commodity CPUs also have many small-integer arithmetic/logic resources, embodied by SSE2 and other vector instruction sets, that are useful for MPKCs. In particular, Intel's SSSE3 instructions can speed up both the public and private maps of Rainbow-type systems by up to 4× over prior software implementations. Furthermore, MPKCs over fields of relatively small odd prime characteristic can exploit SSE2 instructions, which are supported by most modern 64-bit Intel and AMD CPUs. For example, Rainbow over ${\mathbb F}_{31}$ can be up to 2× faster than prior implementations of similarly sized systems over ${\mathbb F}_{16}$. A key advance here is the use of Wiedemann (as opposed to Gauss) solvers to invert the small linear systems in the central maps. We explain the techniques and design choices in implementing our chosen MPKC instances over fields such as ${\mathbb F}_{31}$, ${\mathbb F}_{16}$ and ${\mathbb F}_{256}$. We believe that our results can easily carry over to modern FPGAs, which often contain a large number of small multipliers usable by odd-field MPKCs.
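
The abstract does not say how SSSE3 is applied to small-field arithmetic. One common approach, shown here only as a plain-Python illustration and not as the authors' code, is to multiply a whole vector of ${\mathbb F}_{16}$ elements by one scalar through a 16-entry lookup table; a byte-shuffle instruction such as SSSE3's PSHUFB performs 16 such lookups at once. The field representation (modulus $x^4 + x + 1$) and the function names `gf16_mul`, `mul_table` and `scale_vector` are assumptions made for this sketch.

```python
# Illustrative sketch only; not taken from the paper.
# Assumption: GF(16) = GF(2)[x] / (x^4 + x + 1), elements stored as 4-bit ints.


def gf16_mul(a, b):
    """Multiply two GF(16) elements by shift-and-add with modular reduction."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a          # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1             # multiply a by x
        if a & 0x10:
            a ^= 0x13       # reduce modulo x^4 + x + 1 (binary 10011)
    return r


def mul_table(c):
    """Build the 16-entry table x -> c * x for a fixed scalar c."""
    return [gf16_mul(c, x) for x in range(16)]


def scale_vector(c, vec):
    """Multiply every element of vec by c using one table lookup per element.

    A byte-shuffle instruction does these lookups for 16 elements in parallel.
    """
    table = mul_table(c)
    return [table[x] for x in vec]


if __name__ == "__main__":
    print(scale_vector(2, [0, 1, 8, 15]))
```

The table is built once per scalar and reused across the whole vector, which is what makes the lookup formulation attractive when one coefficient multiplies many field elements.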