Hardware acceleration of matrix multiplication over small prime finite fields

Authors:
Shane T. Fleming;David B. Thomas
Affiliations:
Imperial College London, London, United Kingdom;Imperial College London, London, United Kingdom
Venue:
ARC'13 Proceedings of the 9th international conference on Reconfigurable Computing: architectures, tools, and applications
Year:
2013

Abstract

Dense matrix-matrix multiplication over small finite fields is a common operation in many application domains, such as cryptography, random number generation, and error-correcting codes. This paper shows that FPGAs have the potential to greatly accelerate this time-consuming operation, and in particular that systolic-array-based approaches are both practical and efficient on large modern devices. A number of finite-field-specific architectural optimisations are introduced, allowing n×n matrices to be processed in O(n) cycles for matrix sizes up to n=350. Comparison with optimised software implementations on a single-core CPU shows that an FPGA accelerator on a Virtex-7 XC7V200T can achieve speed-ups of between 80x and 700x for GF(2^k), while for GF(3) and larger finite fields it can provide practical speed-ups of 1000x or more.
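
To illustrate why a systolic array can process n×n matrices in O(n) cycles, the following is a minimal schedule-level sketch in Python. It assumes a textbook output-stationary n×n array over a prime field GF(p), in which processing element (i, j) receives skewed operands and handles inner-product index k = t - i - j at cycle t. This is not the architecture from the paper, whose finite-field-specific optimisations are not described in the abstract. The function names are hypothetical.

```python
def systolic_matmul_mod_p(a, b, p):
    """Model an output-stationary systolic array multiplying n x n matrices over GF(p).

    Assumption: PE (i, j) keeps a running sum for C[i][j]; operands are skewed so
    that A[i][k] and B[k][j] meet at PE (i, j) on cycle t = i + j + k.
    Returns (product, cycles).
    """
    n = len(a)
    acc = [[0] * n for _ in range(n)]
    # The last operand pair meets at PE (n-1, n-1) on cycle 3n - 3,
    # so 3n - 2 cycles suffice: linear in n.
    cycles = max(0, 3 * n - 2)
    for t in range(cycles):
        # In hardware all n*n PEs update in parallel within one cycle;
        # the two inner loops here only emulate that parallelism.
        for i in range(n):
            for j in range(n):
                k = t - i - j
                if 0 <= k < n:
                    acc[i][j] = (acc[i][j] + a[i][k] * b[k][j]) % p
    return acc, cycles


def reference_matmul_mod_p(a, b, p):
    """Plain triple-loop product over GF(p), used only to check the model."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    a = [[1, 2], [0, 1]]
    b = [[2, 0], [1, 2]]
    product, cycles = systolic_matmul_mod_p(a, b, 3)
    print(product, cycles)
```

In this model the cycle count grows linearly with n because the n² multiply-accumulate units work concurrently. A sequential software implementation performs the same n³ multiply-accumulates one after another.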