A Hybrid Approach for Mapping Conjugate Gradient onto an FPGA-Augmented Reconfigurable Supercomputer

Authors:
Gerald R. Morris;Viktor K. Prasanna;Richard D. Anderson
Affiliations:
University of Southern California, Los Angeles, CA;University of Southern California, Los Angeles, CA;Jackson State University, Jackson, MS
Venue:
FCCM '06 Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Year:
2006

Citing 0
Cited 11

Architectures and APIs: assessing requirements for delivering FPGA performance to applications

Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Sparse Matrix Computations on Reconfigurable Hardware

Computer
High-Performance Reduction Circuits Using Deeply Pipelined Operators on FPGAs

IEEE Transactions on Parallel and Distributed Systems
A pipelined-loop-compatible architecture and algorithm to reduce variable-length sets of floating-point data on a reconfigurable computer

Journal of Parallel and Distributed Computing
Sparse Matrix-Vector Multiplication on a Reconfigurable Supercomputer with Application

ACM Transactions on Reconfigurable Technology and Systems (TRETS)
3-D brain MRI tissue classification on FPGAs

IEEE Transactions on Image Processing
FPGA-Array with Bandwidth-Reduction Mechanism for Scalable and Power-Efficient Numerical Simulations Based on Finite Difference Methods

ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Iterative-Gradient Based Complex Divider FPGA Core with Dynamic Configurability of Accuracy and Throughput

Journal of Signal Processing Systems
Optimizing memory bandwidth use and performance for matrix-vector multiplication in iterative methods

ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Optimising memory bandwidth use for matrix-vector multiplication in iterative methods

ARC'10 Proceedings of the 6th international conference on Reconfigurable Computing: architectures, Tools and Applications
A data-driven approach for executing the CG method on reconfigurable high-performance systems

ARCS'13 Proceedings of the 26th international conference on Architecture of Computing Systems

Abstract

Supercomputer companies such as Cray, Silicon Graphics, and SRC Computers now offer reconfigurable computer (RC) systems that combine general-purpose processors (GPPs) with field-programmable gate arrays (FPGAs). The FPGAs can be programmed to become, in effect, application-specific processors. These exciting supercomputers allow end-users to create custom computing architectures aimed at the computationally intensive parts of each problem. This report describes a parameterized, parallelized, deeply pipelined, dual-FPGA, IEEE-754 64-bit floating-point design for accelerating the conjugate gradient (CG) iterative method on an FPGA-augmented RC. The FPGA-based elements are developed via a hybrid approach that uses a high-level language (HLL)-to-hardware description language (HDL) compiler in conjunction with custom-built, VHDL-based, floating-point components. A reference version of the design is implemented on a contemporary RC. Actual run-time performance data compare the FPGA-augmented CG to the software-only version and show that the FPGA-based version runs 1.3 times faster than the software version. Estimates show that the design can achieve a 4-fold speedup on a next-generation RC.
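
For readers unfamiliar with the method named in the abstract, the following is a minimal software-only sketch of the textbook CG iteration for a symmetric positive-definite system Ax = b. It is not the authors' FPGA design, and it does not reflect how the paper partitions work between the GPP and the FPGAs. The function names, the compressed sparse row (CSR) storage layout, the tolerance, and the iteration limit are all assumptions chosen for illustration. The sketch only shows the floating-point kernels each iteration is built from: one sparse matrix-vector product, two dot products, and three vector updates.

```python
import math


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR-stored sparse matrix by a dense vector x."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        # Accumulate the nonzeros of row i against the matching entries of x.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Textbook CG for a symmetric positive-definite matrix in CSR form.

    tol and max_iter are illustrative defaults, not values from the paper.
    """
    x = [0.0] * len(b)      # initial guess x0 = 0
    r = list(b)             # residual r0 = b - A*x0 = b
    p = list(r)             # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) <= tol:
            break           # residual norm is small enough
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Usage: solve [[4, 1], [1, 3]] x = [1, 2]; the exact solution is [1/11, 7/11].
values = [4.0, 1.0, 1.0, 3.0]
col_idx = [0, 1, 0, 1]
row_ptr = [0, 2, 4]
solution = conjugate_gradient(values, col_idx, row_ptr, [1.0, 2.0])
```

All arithmetic here uses Python floats, which are IEEE-754 64-bit values, the same number format the abstract specifies for the hardware design.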