FPGAs are becoming increasingly attractive for high-precision scientific computation. One of the main obstacles to efficient resource utilization is that the resource usage of multipliers grows quadratically with operand size. Much research effort has been devoted to optimizing individual arithmetic and linear algebra operations. In this paper we take a higher-level approach and reduce the intermediate computational precision at the algorithmic level, optimizing accuracy with respect to the final result of the algorithm, which in our case is the accurate solution of partial differential equations (PDEs). Using the Poisson problem as a typical PDE example, we show that most intermediate operations can be computed with floats or even smaller formats, and that only very few operations (e.g. 1%) must be performed in double precision to obtain the same accuracy as a full double precision solver. The FPGA can thus be configured with many parallel float units rather than a few resource-hungry double units. To achieve this, we adapt the general concept of mixed precision iterative refinement methods to FPGAs and develop a fully pipelined version of the Conjugate Gradient solver. We combine this solver with different iterative refinement schemes and precision combinations to obtain resource-efficient mappings of the pipelined algorithm core onto the FPGA.
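The mixed precision iterative refinement scheme described in the abstract can be sketched in software terms: an inner Conjugate Gradient solve runs in single precision, while an outer loop computes residuals and applies corrections in double precision. The sketch below is a minimal illustration, not the paper's FPGA implementation; the 1-D Poisson test matrix, the unpreconditioned CG, and all function names and tolerances are illustrative assumptions.

```python
import numpy as np

def poisson_1d(n):
    # 1-D Poisson stencil [-1, 2, -1] as a dense matrix (toy test problem)
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def cg(A, b, tol=1e-4, max_iter=1000):
    # Plain Conjugate Gradient; runs entirely in the dtype of A and b,
    # so passing float32 inputs gives a single-precision inner solve.
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol * np.linalg.norm(b):
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def mixed_precision_refine(A, b, outer_iters=10, tol=1e-12):
    # Outer refinement loop in float64; the bulk of the work (the inner
    # CG solves) is done in float32, mirroring the float/double split.
    A32 = A.astype(np.float32)
    x = np.zeros_like(b)
    for _ in range(outer_iters):
        r = b - A @ x                       # residual in double precision
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        d = cg(A32, r.astype(np.float32))   # correction in single precision
        x += d.astype(np.float64)           # update in double precision
    return x
```

Each outer cycle cuts the double-precision residual by roughly the inner solver's tolerance, so only the few residual and update operations per cycle need double precision, which is the effect the paper exploits to fit many parallel float units on the FPGA.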