Efficient Multiplication of Polynomials on Graphics Hardware

Authors:
Pavel Emeliyanenko
Affiliations:
Max-Planck-Institut für Informatik, Saarbrücken, Germany
Venue:
APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
Year:
2009

Abstract

We present an algorithm for efficiently multiplying univariate polynomials with integer coefficients on Graphics Processing Units (GPUs) using the Number Theoretic Transform (NTT). The same approach can be used to multiply large integers encoded as polynomials. Our algorithm exploits the fused multiply-add capabilities of the graphics hardware. NTT multiplications are executed in parallel for a set of distinct primes, and the result is then reconstructed on the GPU using the Chinese Remainder Theorem (CRT). Our benchmarks show NTT multiplication performance of up to 77 GMul/s. We compare our approach with the CPU-based implementations of polynomial and large integer multiplication provided by the NTL and GMP libraries.
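The multi-prime scheme described in the abstract can be illustrated with a minimal sequential sketch in Python. It is not the paper's GPU implementation: the three moduli, the primitive root, and the function names (`ntt`, `mul_mod_p`, `crt`, `multiply`) are assumptions chosen for illustration, and the loop over primes, which the abstract says runs in parallel on the GPU, runs sequentially here. The sketch assumes every coefficient of the true product is smaller in absolute value than half the product of the moduli.

```python
# Illustrative NTT-friendly primes of the form c * 2^k + 1, each having
# 3 as a primitive root. These are assumptions, not the paper's moduli.
PRIMES = (998244353, 469762049, 167772161)
ROOT = 3


def ntt(a, p, invert=False):
    """In-place iterative radix-2 NTT modulo p; len(a) must be a power of two."""
    n = len(a)
    # Bit-reversal permutation.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # Butterfly passes over blocks of doubling length.
    length = 2
    while length <= n:
        w_len = pow(ROOT, (p - 1) // length, p)
        if invert:
            w_len = pow(w_len, p - 2, p)  # inverse root via Fermat
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % p
                a[k] = (u + v) % p
                a[k + half] = (u - v) % p
                w = w * w_len % p
        length <<= 1
    if invert:
        n_inv = pow(n, p - 2, p)
        for i in range(n):
            a[i] = a[i] * n_inv % p
    return a


def mul_mod_p(f, g, p):
    """Product of coefficient lists f and g with coefficients reduced mod p."""
    out_len = len(f) + len(g) - 1
    size = 1
    while size < out_len:
        size <<= 1
    fa = [c % p for c in f] + [0] * (size - len(f))
    ga = [c % p for c in g] + [0] * (size - len(g))
    ntt(fa, p)
    ntt(ga, p)
    prod = [x * y % p for x, y in zip(fa, ga)]
    ntt(prod, p, invert=True)
    return prod[:out_len]


def crt(residues, moduli):
    """Incremental CRT; returns the representative in the symmetric range."""
    x, m = 0, 1
    for r, p in zip(residues, moduli):
        # Choose t so that x + m * t is congruent to r modulo p.
        t = (r - x) * pow(m % p, p - 2, p) % p
        x += m * t
        m *= p
    return x - m if x > m // 2 else x


def multiply(f, g):
    """Integer polynomial product: one NTT product per prime, then CRT."""
    per_prime = [mul_mod_p(f, g, p) for p in PRIMES]
    return [crt(col, PRIMES) for col in zip(*per_prime)]
```

For example, `multiply([1, 2, 3], [4, 5, 6])` multiplies 1 + 2x + 3x^2 by 4 + 5x + 6x^2. A large integer can be handled the same way by splitting it into fixed-width digits, treating the digits as coefficients, and propagating carries after the product.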